Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide referendum in the UK
  – the 1st was in 1975
• Member of Parliament elections: Plurality rule → Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile
  Alice: A > B > C
  Bob:   A > B > C
  Carol: B > C > A
    ↓ social choice mechanism
  Outcome: A

2

Ranking pictures [PGM+ AAAI-12]
(figure: three pictures A, B, C to be ranked by crowd workers)
  Turker 1: A > B > C
  Turker 2: B > A
  …
  Turker n: B > C
3

Social choice
  Profile: R1 (R1*), R2 (R2*), …, Rn (Rn*)  →  social choice mechanism  →  Outcome
  Ri, Ri*: full rankings over a set A of alternatives

4

Social Choice and Computer Science
  CS → Social Choice: computational thinking + optimization algorithms (21st century)
  Social Choice → CS: strategic thinking + methods/principles of aggregation
  Timeline: PLATO (4th c. B.C.), LULL (13th c.), BORDA, CONDORCET (18th c.), ARROW, TURING et al. (20th c.)

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• The area has recently been drawing a lot of attention
  – IJCAI-11: 15 papers, best paper
  – AAAI-11: 6 papers, best paper
  – AAMAS-11: 10 full papers, best paper runner-up
  – AAMAS-12: 9 full papers, best student paper
  – EC-12: 3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social choice mechanism?
What does it mean to be "good"?

9

Two goals for social choice mechanisms
  GOAL 1 (democracy): addressed in 1. Classical Social Choice and 2. Computational aspects
  GOAL 2 (truth): addressed in 3. Statistical approaches
10

Outline
  1. Classical Social Choice (45 min)
  2.1 Computational aspects, Part 1 (55 min)
  2.2 Computational aspects, Part 2 (30 min)
  3. Statistical approaches (75 min)

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
  – defined by a score vector (s1, ..., sm)

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes:
    c1 > c2 > c3
    c2 > c1 > c3
    c3 > c1 > c2
• c1 gets 2+1+1=4, c2 gets 1+2+0=3, c3 gets 0+0+2=2
• The winner is c1
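A minimal Python sketch of winner determination under a positional scoring rule, reproducing the Borda example above; the function name scoring_winner and the simple tie handling are illustrative, not from any particular library.

# Sketch: positional scoring rule winner (illustrative, reproducing the example above).
def scoring_winner(profile, score_vector):
    """profile: list of rankings (best to worst); score_vector: points per position."""
    totals = {}
    for ranking in profile:
        for position, alternative in enumerate(ranking):
            totals[alternative] = totals.get(alternative, 0) + score_vector[position]
    # Ties are not handled here; a real rule needs an explicit tie-breaking scheme.
    return max(totals, key=totals.get), totals

profile = [["c1", "c2", "c3"], ["c2", "c1", "c3"], ["c3", "c1", "c2"]]
print(scoring_winner(profile, (2, 1, 0)))   # ('c1', {'c1': 4, 'c2': 3, 'c3': 2})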

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
• Example:
    10 voters: a > b > c > d
     7 voters: d > a > b > c
     6 voters: c > d > a > b
     3 voters: b > c > d > a
  Winner: d
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
• Example (same profile as above):
    10 voters: a > b > c > d
     7 voters: d > a > b > c
     6 voters: c > d > a > b
     3 voters: b > c > d > a
  Winner: a
15
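A small Python sketch of STV on the profile above; the tie-breaking among lowest-scoring alternatives is left arbitrary here, which a full specification of the rule would have to pin down.

# Sketch: single transferable vote (STV) on weighted rankings (illustrative).
from collections import Counter

def stv_winner(weighted_votes):
    """weighted_votes: list of (count, ranking) pairs, ranking from best to worst."""
    votes = [(count, list(ranking)) for count, ranking in weighted_votes]
    remaining = {a for _, ranking in votes for a in ranking}
    while len(remaining) > 1:
        plurality = Counter({a: 0 for a in remaining})
        for count, ranking in votes:
            plurality[ranking[0]] += count
        loser = min(plurality, key=plurality.get)   # lowest plurality score drops out
        remaining.remove(loser)
        votes = [(count, [a for a in ranking if a != loser]) for count, ranking in votes]
    return remaining.pop()

profile = [(10, "abcd"), (7, "dabc"), (6, "cdab"), (3, "bcda")]
print(stv_winner(profile))   # 'a', as on the slide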

The Kemeny rule
• Kendall tau distance
  – K(V,W) = # {pairwise comparisons on which V and W differ}
  – Example: K( b ≻ c ≻ a , a ≻ b ≻ c ) = 2 (they differ on {a,b} and {a,c})
• Kemeny(D) = argminW K(D,W) = argminW ΣP∈D K(P,W)
• For a single winner, choose the top-ranked alternative in Kemeny(D)
• [Has a statistical interpretation]
16
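A brute-force Python sketch of the Kendall tau distance and the Kemeny ranking; enumerating all m! rankings is only feasible for very small m, which already hints at the hardness results discussed later.

# Sketch: Kendall tau distance and brute-force Kemeny (illustrative, small m only).
from itertools import combinations, permutations

def kendall_tau(v, w):
    """Number of pairs of alternatives on which rankings v and w disagree."""
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2)
               if (pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b]))

def kemeny_ranking(profile):
    return min(permutations(profile[0]),
               key=lambda w: sum(kendall_tau(p, w) for p in profile))

print(kendall_tau("bca", "abc"))               # 2: they disagree on {a,b} and {a,c}
print(kemeny_ranking(["abc", "abc", "bca"]))   # ('a', 'b', 'c')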

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠∅, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
                            Condorcet consistency   Consistency   Easy to compute
  Positional scoring rules           N                   Y               Y
  Kemeny                             Y                   N               N
  Ranked pairs                       Y                   N               Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that select a single winner, anonymity is not compatible with neutrality
  – Proof sketch (in its simplest instance, with two alternatives A and B; the slide draws the same
    argument with pictured alternatives): let Alice vote A > B and Bob vote B > A, and w.l.o.g. suppose
    the winner is A. Swapping the names of A and B turns the profile into (Alice: B > A, Bob: A > B),
    so by neutrality the winner becomes B; but this is just the original profile with Alice's and Bob's
    votes exchanged, so by anonymity the winner is still A. Contradiction.
21

Another easy fact
[Fishburn APSR-74]
• Thm. No positional scoring rule is Condorcet consistent:
  – suppose s1 > s2 > s3, and consider 7 votes over three alternatives (pictured on the original
    slide; one consistent reconstruction, with labels a, b, c):
      3 voters: a > b > c
      2 voters: b > c > a
      1 voter:  b > a > c
      1 voter:  c > a > b
  – a is the Condorcet winner: it beats b and c 4-3 in pairwise elections
  – a's score is 3s1 + 2s2 + 2s3, but b's score is 3s1 + 3s2 + 1s3, which is strictly larger
    whenever s2 > s3, so the Condorcet winner a does not win
22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
  – Template: a voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
  – If you believe A1, A2, A3 are the most desirable properties, then X is optimal
  – (anonymity + neutrality + consistency + continuity) ⇔ positional scoring rules [Young SIAMAM-75]
  – (neutrality + consistency + Condorcet consistency) ⇔ Kemeny [Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule's satisfaction of these axiomatic properties?
  – Tradeoffs between the satisfaction of different axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
  1. Classical Social Choice (45 min)
  2.1 Computational aspects, Part 1 (55 min)
  2.2 Computational aspects, Part 2 (30 min)
  3. Statistical approaches (75 min)
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: n·m·log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
  – Issues = { Main course, Wine }
  – Alternatives = {main course options} × {wine options} (shown as pictures on the original slide)
30
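A tiny Python sketch of how multi-issue domains blow up: the alternative set is the Cartesian product of the issue domains. The concrete menu items are illustrative placeholders, since the slide shows pictures.

# Sketch: a combinatorial (multi-issue) domain as a Cartesian product (illustrative values).
from itertools import product

issue_domains = {"main course": ["beef", "fish", "vegetarian"],
                 "wine": ["red", "white"]}
alternatives = list(product(*issue_domains.values()))
print(len(alternatives))   # 3 * 2 = 6 combined alternatives
print(alternatives[:2])    # [('beef', 'red'), ('beef', 'white')]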

Multiple referenda
• In California, voters voted on 11 binary (yes/no) issues
  – 2^11 = 2048 combinations in total
  – 5 of the 11 are about budget and taxes
    • Prop. 30: increase sales and some income tax for education
    • Prop. 38: increase income tax on almost everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
  – knowledge bases K1, K2, …, Kn → merging operator → merged base
• Judgment aggregation [List and Pettit EP-02]
              Action P   Action Q   Liable? (P∧Q)
  Judge 1        Y          Y            Y
  Judge 2        Y          N            N
  Judge 3        N          Y            N
  Majority       Y          Y            N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under the plurality rule
(ties are broken by a fixed priority order over the alternatives)
• Example: Alice, Bob, and Carol each rank three alternatives (shown as pictures on the original
  slide); by ranking a less-preferred alternative on top, one of the voters can change the plurality
  winner to an alternative she likes better than the truthful outcome

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: a voter submits 0 or 1 for each alternative

41

Computational thinking
• Use a voting rule so complicated that nobody can easily predict the winner
  – Dodgson
  – Kemeny
  – The randomized voting rule used in the Republic of Venice for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
  Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
  – Why prevent manipulation? It may lead to very undesirable outcomes
  – Can we use computational complexity as a barrier? Yes
  – Is it a strong barrier? No: how often does hardness bite? Seems not very often
  – Other barriers? Limited information, limited communication
43

Manipulation: a computational complexity perspective
  If it is computationally too hard for a manipulator to compute a manipulation, she is best off
  voting truthfully
  – similar to the role of hardness in cryptography
  For which common voting rules is manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
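A brute-force Python sketch of the UCM decision problem for tiny instances, trying every joint vote of the n' manipulators; rule can be any profile-to-winner function, for instance a wrapper around the positional-scoring sketch shown earlier. All names here are illustrative.

# Sketch: brute-force UCM for tiny instances (illustrative only; exponential in n' and m).
from itertools import permutations, product

def ucm(rule, non_manipulators, n_manipulators, c, alternatives):
    """Return a manipulator profile making c win under `rule`, or None if none exists."""
    for manip_profile in product(permutations(alternatives), repeat=n_manipulators):
        if rule(non_manipulators + [list(v) for v in manip_profile]) == c:
            return manip_profile
    return None

# Example usage (assuming the earlier scoring_winner sketch is available):
# borda = lambda profile: scoring_winner(profile, (2, 1, 0))[0]
# print(ucm(borda, [["a", "b", "c"], ["b", "a", "c"]], 1, "b", "abc"))  # (('b', 'a', 'c'),)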

The stunningly big table for UCM
  # manipulators         One manipulator          At least two
  Copeland               P   [BTT SCW-89b]        NPC [FHS AAMAS-08,10]
  STV                    NPC [BO SCW-91]          NPC [BO SCW-91]
  Veto                   P   [ZPR AIJ-09]         P   [ZPR AIJ-09]
  Plurality with runoff  P   [ZPR AIJ-09]         P   [ZPR AIJ-09]
  Cup                    P   [CSL JACM-07]        P   [CSL JACM-07]
  Borda                  P   [BTT SCW-89b]        NPC [DKN+ AAAI-11, BNW IJCAI-11]
  Maximin                P   [BTT SCW-89b]        NPC [XZP+ IJCAI-09]
  Ranked pairs           NPC [XZP+ IJCAI-09]      NPC [XZP+ IJCAI-09]
  Bucklin                P   [XZP+ IJCAI-09]      P   [XZP+ IJCAI-09]
  Nanson's rule          NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
  Baldwin's rule         NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result
[Xia&Conitzer EC-08a]
• Theorem. For any generalized scoring rule (including many common voting rules): with o(√n)
  manipulators the coalition almost surely has no power, with ω(√n) manipulators it is almost surely
  all-powerful, and Θ(√n) is the threshold
• Computational complexity is not a strong barrier against manipulation
  – UCM as a decision problem is easy to compute in most cases
  – The case of Θ(√n) has been studied experimentally in [Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
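A Python sketch in the spirit of that greedy algorithm for Borda: add manipulators one at a time, each ranking c first and the remaining alternatives from currently weakest to currently strongest. This is a hedged illustration rather than the paper's exact pseudocode, and it assumes ties are broken in favor of c.

# Sketch: greedy coalitional manipulation for Borda (illustrative; ties assumed broken toward c).
def borda_scores(profile, alternatives):
    m = len(alternatives)
    scores = {a: 0 for a in alternatives}
    for ranking in profile:
        for pos, a in enumerate(ranking):
            scores[a] += m - 1 - pos
    return scores

def greedy_borda_manipulation(non_manipulators, c, alternatives, max_manipulators=50):
    profile = [list(r) for r in non_manipulators]
    for k in range(1, max_manipulators + 1):
        scores = borda_scores(profile, alternatives)
        others = sorted((a for a in alternatives if a != c), key=lambda a: scores[a])
        profile.append([c] + others)            # c first, currently weakest rivals right after
        scores = borda_scores(profile, alternatives)
        if all(scores[c] >= scores[a] for a in alternatives if a != c):
            return k, profile[len(non_manipulators):]
    return None

non_manip = [["a", "b", "c"], ["a", "c", "b"], ["b", "c", "a"]]
print(greedy_borda_manipulation(non_manip, "c", ["a", "b", "c"]))   # (1, [['c', 'b', 'a']])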

An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
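For intuition, the optimal preemptive makespan on uniform machines has a well-known closed form, which the Gonzalez-Sahni algorithm builds on; the sketch below assumes that formula rather than implementing the full algorithm.

# Sketch: optimal makespan for Q|pmtn|Cmax via the standard closed form (assumed, not derived here):
# Cmax* = max( max_k (sum of k largest jobs)/(sum of k fastest speeds), total work / total speed ).
def optimal_makespan(jobs, speeds):
    jobs = sorted(jobs, reverse=True)          # largest processing requirements first
    speeds = sorted(speeds, reverse=True)      # fastest machines first
    k_max = min(len(jobs), len(speeds))
    bounds = [sum(jobs) / sum(speeds[:k_max])]
    for k in range(1, k_max):
        bounds.append(sum(jobs[:k]) / sum(speeds[:k]))
    return max(bounds)

print(optimal_makespan(jobs=[9, 5, 2], speeds=[3, 2, 1]))   # 3.0: the largest job alone needs 9/3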

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the non-manipulators' profile PNM
• Every manipulator ranks c first, so each manipulator vote gives c a score of s1 and gives the
  alternative placed in position i+1 a score of s(i+1)
• E.g. adding one manipulator vote V1 = [c > c1 > c2 > c3] shrinks c1's lead over c by s1 - s2,
  c2's lead by s1 - s3, and c3's lead by s1 - s4
• This is a scheduling instance: alternative ci becomes a job Ji with workload pi - p (its current
  lead over c), position i+1 becomes a machine Mi with speed s1 - s(i+1), and each manipulator vote
  is one unit of time in which every machine processes a distinct job
55

The approximation algorithm
  Original UCO instance → reduce to the scheduling problem → solve the scheduling problem optimally
  [Gonzalez&Sahni JACM 78] → round the solution → solution to the UCO
  The rounded solution uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
  – Borda manipulation corresponds to scheduling where the machine speeds are m-1, m-2, …, 0
    • NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]
  – UCM for Borda is NP-complete for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
  – The manipulators have complete information about the non-manipulators' votes
  – The manipulators can perfectly coordinate their strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
  – The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
    • The leader and followers have the same preferences
  – Safe manipulation [Slinko&White COMSOC-08]: a vote W such that
    • no matter how many followers there are, the leader and the potential followers are not worse off
    • sometimes they are better off
– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
  1. Classical Social Choice (45 min)
  2.1 Computational aspects, Part 1 (55 min)
  2.2 Computational aspects, Part 2 (30 min)
  3. Statistical approaches (75 min)
71

Ranking pictures [PGM+ AAAI-12]
(figure: three pictures A, B, C to be ranked by crowd workers)
  Turker 1: A > B > C
  Turker 2: B > A
  …
  Turker n: B > C
72

Two goals for social choice mechanisms
  GOAL 1 (democracy): addressed in 1. Classical Social Choice and 2. Computational aspects
  GOAL 2 (truth): addressed in 3. Statistical approaches
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
  – two alternatives {a, b}
  – a probability p > 0.5
• Suppose
  – each agent's preference is generated i.i.d., such that
    – w/p p, it is the same as the ground truth
    – w/p 1-p, it is different from the ground truth
• Then, as n→∞, the majority of agents' preferences converges in probability to the ground truth
75
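A small Python simulation sketch (not from the slides): each agent independently matches the ground truth with probability p > 0.5, and the fraction of trials in which the majority is correct climbs toward 1 as n grows.

# Sketch: simulating the Condorcet Jury theorem (illustrative parameters).
import random

def majority_correct(n, p, trials=2000):
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        hits += correct_votes > n / 2
    return hits / trials

random.seed(0)
for n in (1, 11, 101, 1001):
    print(n, majority_correct(n, p=0.6))   # estimated probability rises toward 1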

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

  "Ground truth" Θ  →  P1, P2, …, Pn
  – Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given "ground truth" opinions W and p<1, generate opinions V s.t. each opinion is i.i.d.:
  – if c≻d in W, then c≻d in V w/p p and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
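A brute-force Python sketch checking this on a tiny profile: under Mallows' model, Pr(D|W) is proportional to ϕ raised to the total Kendall tau distance from W to the votes, so maximizing the likelihood is exactly minimizing the Kemeny objective.

# Sketch: MLE under Mallows' model by brute force equals a Kemeny ranking (small m only).
from itertools import combinations, permutations

def kendall_tau(v, w):
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2)
               if (pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b]))

def mallows_mle(profile, phi=0.5):
    # The normalization constant does not depend on W, so it can be ignored here.
    return max(permutations(profile[0]),
               key=lambda w: phi ** sum(kendall_tau(p, w) for p in profile))

print(mallows_mle(["abc", "abc", "bca"]))   # ('a', 'b', 'c'), the Kemeny ranking of this profile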

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
• Given a model Mr: a ground truth Θ generates the data D = (P1, P2, …, Pn)
• Step 1 (statistical inference): from the data D, extract information about the ground truth
• Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows' model
• Step 1 (MLE): compute the most probable ranking given the data D = (P1, P2, …, Pn)
• Step 2 (top-1 alternative): output its top-ranked alternative
• Decision space: a unique winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
  – You observe 10 heads, 4 tails
  – Do you think the next two tosses will be two heads in a row?
  (Credit: Panos Ipeirotis & Roy Radner)
• Likelihood reasoning
  – there is an unknown but fixed ground truth
  – p = 10/14 ≈ 0.714
  – Pr(2 heads | p = 0.714) = (0.714)^2 ≈ 0.51 > 0.5
  – Yes!
• Bayesian
  – the ground truth is captured by a belief distribution
  – compute Pr(p | Data) assuming a uniform prior
  – Pr(2 heads | Data) ≈ 0.485 < 0.5
  – No!
83
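The two numbers above can be reproduced directly; a quick Python sketch (with a uniform prior, the posterior after 10 heads and 4 tails is Beta(11,5), so Pr(two heads | Data) = E[p^2] = (11/16)(12/17)).

# Sketch: reproducing the likelihood vs. Bayesian answers for the coin example.
p_hat = 10 / 14
print(p_hat ** 2)              # likelihood reasoning: about 0.51 > 0.5, so "Yes"

# Bayesian reasoning with a uniform prior: posterior Beta(11, 5), E[p^2] = (11/16)*(12/17)
print((11 / 16) * (12 / 17))   # about 0.485 < 0.5, so "No"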

Kemeny = Likelihood approach
• Mr = Mallows' model
• Step 1 (MLE): compute the most probable ranking from the data D
• Step 2 (top-1 alternative): output its top-ranked alternative as the winner
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet's model
• Step 1: compute the likelihood of all parameters (opinions) given the data D
• Step 2: choose the top alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1 (Bayesian update): compute the posterior over rankings given the data D
• Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                         Anonymity, neutrality,   Consistency   Condorcet   Easy to
                         monotonicity                                       compute
  Likelihood (Mallows)            Y                    N            Y          N
  Bayesian (Condorcet)            Y                    Y            N          N
  Decision space: single winners
  Assume uniform prior in the Bayesian approach
  Principle: statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠∅, then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical decision framework
(Data D → inference → information about the ground truth → decision making → decision)
• Model selection
  – How can we evaluate fitness?
• Likelihood or Bayesian?
  – Focus on MLE
• Computation
  – How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities
  – Pr(c2 ≻ c1 ≻ c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
  (figure: the utility distributions μ1, μ2, μ3, parameterized by θ1, θ2, θ3, with one sampled draw U1, U2, U3)

95
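A small Python sketch (not from the slides) of generating votes from a RUM with normal utility distributions: draw one perceived utility per alternative and sort. The theta values below are illustrative.

# Sketch: sampling rankings from a RUM with normal utilities (illustrative parameters).
import random

def sample_ranking(theta, sigma=1.0):
    utilities = {a: random.gauss(mu, sigma) for a, mu in theta.items()}
    return sorted(theta, key=utilities.get, reverse=True)   # best first

random.seed(0)
theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}
profile = [sample_ranking(theta) for _ in range(5)]
print(profile)   # rankings tend to put c1 ahead of c2 ahead of c3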

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏R∈Data Pr(R | θ1, θ2, θ3)
  Parameters (θ1, θ2, θ3) independently generate
    Agent 1: P1 = c2 ≻ c1 ≻ c3
    …
    Agent n: Pn = c1 ≻ c2 ≻ c3
96

RUMs with Gumbel distributions
• μi's are Gumbel distributions
  – A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]
• Equivalently, there exist positive numbers λ1, …, λm such that
  Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = [λ1/(λ1 + … + λm)] × [λ2/(λ2 + … + λm)] × … × [λm-1/(λm-1 + λm)]
  – the first factor is the probability that c1 is the top choice in {c1, …, cm}, the second that c2 is
    the top choice in {c2, …, cm}, …, the last that cm-1 is preferred to cm
• Pros:
  – Computationally tractable
    • Analytical solution to the likelihood function
    • The only RUM that was known to be tractable
• Widely applied in Economics [McFadden 74], learning to rank [Liu 11], and analyzing elections [GM 06,07,08,09]
• Cons: does not seem to fit very well

97
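A Python sketch of the Plackett-Luce probability of a ranking, following the product formula above; the lambda values are illustrative, and the final line checks that the probabilities over all rankings sum to 1.

# Sketch: Plackett-Luce probability of a ranking (illustrative lambda values).
from itertools import permutations

def plackett_luce_prob(ranking, lam):
    """ranking: alternatives from best to worst; lam: dict alternative -> positive weight."""
    prob, remaining = 1.0, list(ranking)
    while len(remaining) > 1:
        prob *= lam[remaining[0]] / sum(lam[a] for a in remaining)
        remaining = remaining[1:]
    return prob

lam = {"a": 3.0, "b": 2.0, "c": 1.0}
print(plackett_luce_prob("abc", lam))                                # (3/6)*(2/3) = 1/3
print(sum(plackett_luce_prob(r, lam) for r in permutations("abc")))  # sums to 1.0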

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
  Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) … ∫_{U2}^{∞} μ1(U1) dU1 … dUm-1 dUm
  (Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞)

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
  – E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
      ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
    (approximately computed by Gibbs sampling)
  – M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Repeat until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                              LL            Pred. LL      AIC             BIC
  Value(Normal) - Value(PL)   44.8 (15.8)   87.4 (30.5)   -79.6 (31.6)    -50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
  • Easy-to-compute axiom
  • Hard-to-manipulate axiom
  • Computational thinking + game-theoretic analysis
3. Statistical approaches
  • Framework based on statistical decision theory
  • Model selection
  • Condorcet/Mallows vs. RUM
CS ↔ Social Choice: computational thinking + optimization algorithms; strategic thinking + methods/principles of aggregation
Thank you!


alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 4

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 9

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
10 votes: a > b > c > d
7 votes:  d > a > b > c
6 votes:  c > d > a > b
3 votes:  b > c > d > a
The winner is d
14
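To make the two-round computation concrete, here is a minimal Python sketch (assuming the reconstructed profile above and tie-breaking by plurality score; the encoding is illustrative, not from the slides):

from collections import Counter

def plurality_with_runoff(profile):
    # profile: list of (count, ranking) pairs; ranking is a tuple, best alternative first
    scores = Counter()
    for count, ranking in profile:
        scores[ranking[0]] += count
    # round 1: keep the two alternatives with the highest plurality scores
    finalists = [a for a, _ in scores.most_common(2)]
    # round 2: pairwise comparison between the two finalists
    tally = Counter()
    for count, ranking in profile:
        preferred = min(finalists, key=ranking.index)  # the finalist ranked higher in this vote
        tally[preferred] += count
    return max(finalists, key=lambda a: tally[a])

profile = [(10, ('a', 'b', 'c', 'd')),
           (7,  ('d', 'a', 'b', 'c')),
           (6,  ('c', 'd', 'a', 'b')),
           (3,  ('b', 'c', 'd', 'a'))]
print(plurality_with_runoff(profile))  # d beats a 16-10 in the runoff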

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
10 votes: a > b > c > d
7 votes:  d > a > b > c
6 votes:  c > d > a > b
3 votes:  b > c > d > a
The winner is a
15
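A similarly minimal STV/IRV sketch on the same reconstructed profile; tie-breaking among lowest-scoring alternatives is left arbitrary here, which the slide does not specify:

from collections import Counter

def stv(profile, alternatives):
    remaining = set(alternatives)
    while len(remaining) > 1:
        scores = Counter({a: 0 for a in remaining})
        for count, ranking in profile:
            top = next(a for a in ranking if a in remaining)  # current top choice among survivors
            scores[top] += count
        loser = min(remaining, key=lambda a: scores[a])  # lowest plurality score drops out
        remaining.remove(loser)
    return remaining.pop()

profile = [(10, ('a', 'b', 'c', 'd')),
           (7,  ('d', 'a', 'b', 'c')),
           (6,  ('c', 'd', 'a', 'b')),
           (3,  ('b', 'c', 'd', 'a'))]
print(stv(profile, 'abcd'))  # b, then d, then c are eliminated; a wins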

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 2
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16
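A brute-force sketch of the two definitions above, the Kendall tau distance and the Kemeny ranking; enumerating all m! rankings is only meant to illustrate the definition (consistent with the hardness results discussed later):

from itertools import combinations, permutations

def kendall_tau(v, w):
    # number of pairs of alternatives on which rankings v and w disagree
    return sum(1 for a, b in combinations(v, 2)
               if (v.index(a) < v.index(b)) != (w.index(a) < w.index(b)))

def kemeny(profile, alternatives):
    # profile: list of (count, ranking); returns a ranking minimizing total Kendall tau distance
    return min(permutations(alternatives),
               key=lambda w: sum(count * kendall_tau(r, w) for count, r in profile))

profile = [(1, ('b', 'c', 'a')), (1, ('a', 'b', 'c'))]
print(kendall_tau(('b', 'c', 'a'), ('a', 'b', 'c')))  # 2 disagreeing pairs: {a,b} and {a,c}
print(kemeny(profile, 'abc'))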

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
                          Condorcet consistency   Consistency   Easy to compute
Positional scoring rules           N                   Y               Y
Kemeny                             Y                   N               N
Ranked pairs                       Y                   N               Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that select a single winner, anonymity is not compatible with neutrality
– proof sketch: w.l.o.g. consider two voters with opposite rankings over two alternatives; by anonymity, swapping the two voters must not change the winner, while by neutrality, swapping the two alternatives must change it, yet the two swaps produce the same profile, a contradiction
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

– consider the following 7-voter profile over three alternatives (call them a, b, c; the slide shows them as pictures):
  3 voters: a > b > c
  2 voters: b > c > a
  1 voter:  b > a > c
  1 voter:  c > a > b
– a is the Condorcet winner, but its total score 3s1 + 2s2 + 2s3 is less than b's total score 3s1 + 3s2 + 1s3 whenever s2 > s3, so a does not win under any positional scoring rule
22
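A quick numerical check of the argument, assuming the profile reconstructed above (the labels a, b, c are stand-ins for the pictured alternatives):

profile = [(3, ('a', 'b', 'c')),
           (2, ('b', 'c', 'a')),
           (1, ('b', 'a', 'c')),
           (1, ('c', 'a', 'b'))]

def score(alternative, score_vector, profile):
    # total positional score of an alternative under the given score vector
    return sum(count * score_vector[ranking.index(alternative)] for count, ranking in profile)

s = (2, 1, 0)  # Borda for m = 3
print(score('a', s, profile), score('b', s, profile))  # 8 < 9: the Condorcet winner a scores less than b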

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: a voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
– If you believe A1, A2, A3 are the most desirable properties, then X is optimal
– (anonymity + neutrality + consistency + continuity) ⇔ positional scoring rules [Young SIAMAM-75]
– (neutrality + consistency + Condorcet consistency) ⇔ Kemeny [Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: n·m·log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues = { Main course, Wine }
– Alternatives = { main-course options } × { wine options }
30

Multiple referenda
• In California, voters voted on 11 binary (yes / no) issues
– 2^11 = 2048 combinations in total
– 5 of the 11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
           Action P   Action Q   Liable? (P∧Q)
Judge 1       Y          Y            Y
Judge 2       Y          N            N
Judge 3       N          Y            N
Majority      Y          Y            N
36
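A tiny sketch of the table above: proposition-wise majority accepts P and Q but rejects P∧Q, so the collective judgment is logically inconsistent (the doctrinal paradox):

judges = [
    {'P': True,  'Q': True,  'P_and_Q': True},
    {'P': True,  'Q': False, 'P_and_Q': False},
    {'P': False, 'Q': True,  'P_and_Q': False},
]

def majority(issue):
    return sum(j[issue] for j in judges) > len(judges) / 2

verdict = {issue: majority(issue) for issue in ('P', 'Q', 'P_and_Q')}
print(verdict)                                                # {'P': True, 'Q': True, 'P_and_Q': False}
print(verdict['P_and_Q'] == (verdict['P'] and verdict['Q']))  # False: majority judgments are inconsistent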

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of a fixed alternative)
Alice, Bob, and Carol each rank three alternatives; under the plurality rule, Alice can obtain a winner she prefers by reporting a ranking different from her true one

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is so complicated that nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes; how often? Seems not very often
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
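For intuition, a brute-force UCM checker for tiny instances, instantiated here with the Borda rule and lexicographic tie-breaking (both are illustrative assumptions, not part of the problem definition), which simply enumerates all joint votes of the manipulators:

from itertools import permutations, combinations_with_replacement

def borda_winner(votes, alternatives):
    m = len(alternatives)
    scores = {a: 0 for a in alternatives}
    for ranking in votes:
        for position, a in enumerate(ranking):
            scores[a] += m - 1 - position
    return max(alternatives, key=lambda a: scores[a])  # ties broken by the order of `alternatives`

def ucm_borda(non_manipulators, n_manipulators, c, alternatives):
    # is there a joint vote of the n' manipulators making c the winner?
    rankings = list(permutations(alternatives))
    for manip_votes in combinations_with_replacement(rankings, n_manipulators):
        if borda_winner(list(non_manipulators) + list(manip_votes), alternatives) == c:
            return True
    return False

P_NM = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'c', 'a')]
print(ucm_borda(P_NM, 1, 'b', 'abc'))  # True: one manipulator voting b > c > a makes b win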

The stunningly big table for UCM
#manipulators            One manipulator          At least two
Copeland                 P [BTT SCW-89b]          NPC [FHS AAMAS-08,10]
STV                      NPC [BO SCW-91]          NPC [BO SCW-91]
Veto                     P [ZPR AIJ-09]           P [ZPR AIJ-09]
Plurality with runoff    P [ZPR AIJ-09]           P [ZPR AIJ-09]
Cup                      P [CSL JACM-07]          P [CSL JACM-07]
Borda                    P [BTT SCW-89b]          NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                  P [BTT SCW-89b]          NPC [XZP+ IJCAI-09]
Ranked pairs             NPC [XZP+ IJCAI-09]      NPC [XZP+ IJCAI-09]
Bucklin                  P [XZP+ IJCAI-09]        P [XZP+ IJCAI-09]
Nanson's rule            NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
Baldwin's rule           NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– o(√n) manipulators: no power (w.h.p. they cannot change the winner)
– ω(√n) manipulators: all-powerful (w.h.p. they can make any alternative win)
– the phase transition happens at Θ(√n) manipulators
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
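For reference, Q|pmtn|Cmax has a closed-form optimal makespan (achievable, e.g., by the level algorithm): with jobs and speeds sorted in decreasing order, it is the maximum over k of (sum of the k largest jobs)/(sum of the k fastest speeds), together with total work over total speed. A small sketch, with illustrative numbers:

def min_makespan(job_lengths, speeds):
    # optimal makespan of Q|pmtn|Cmax (jobs may be preempted and moved between machines)
    p = sorted(job_lengths, reverse=True)                 # largest jobs first
    s = sorted(speeds, reverse=True)[:len(job_lengths)]   # at most one machine per job is useful
    bounds = [sum(p[:k]) / sum(s[:k]) for k in range(1, len(s))]
    bounds.append(sum(p) / sum(s))                        # all the work spread over all the machines
    return max(bounds)

print(min_makespan([8, 5, 3], [3, 2, 1]))  # 8/3 ≈ 2.67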

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
(figure: the manipulator vote V1 = [c > c1 > c2 > c3] is added to PNM; alternative ci corresponds to job Ji with processing requirement pi − p, and the position gaps s1 − s2, s1 − s3, s1 − s4 act as the machine speeds, so casting V1 reduces the deficit of c1 by s1 − s2, of c2 by s1 − s3, and of c3 by s1 − s4)
55

The approximation algorithm
Original UCO instance → (reduction) → scheduling problem Q|pmtn|Cmax → optimal schedule [Gonzalez&Sahni JACM 78] → rounding → solution to the UCO instance, using no more than OPT + m − 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability 0.5 < p < 1
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
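A small simulation of the statement above: each agent is independently correct with probability p > 0.5, and the frequency with which the strict majority matches the ground truth approaches 1 as n grows (the parameter values are illustrative):

import random

def majority_correct_frequency(n, p, trials=2000, rng=random.Random(0)):
    correct = 0
    for _ in range(trials):
        votes_for_truth = sum(rng.random() < p for _ in range(n))
        if votes_for_truth > n / 2:      # strict majority recovers the ground truth
            correct += 1
    return correct / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_frequency(n, p=0.55))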

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem can be extended to m > 2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p < 1, generate the opinions in V i.i.d.: for each pair of candidates, if c ≻ d in W, then c ≻ d in V w/p p and d ≻ c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
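A sketch of the Mallows likelihood and its brute-force MLE over rankings; since the normalization constant does not depend on the ground-truth ranking, maximizing the likelihood is the same as minimizing the total Kendall tau distance, i.e., the Kemeny rule:

from itertools import combinations, permutations
from math import log

def kendall(v, w):
    return sum(1 for a, b in combinations(v, 2)
               if (v.index(a) < v.index(b)) != (w.index(a) < w.index(b)))

def mallows_log_likelihood(profile, ground_truth, phi):
    # Pr(V | W) = phi**kendall(V, W) / Z, with Z summed over all rankings V
    rankings = list(permutations(ground_truth))
    Z = sum(phi ** kendall(v, ground_truth) for v in rankings)
    return sum(kendall(v, ground_truth) * log(phi) - log(Z) for v in profile)

profile = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c')]
mle = max(permutations('abc'), key=lambda w: mallows_log_likelihood(profile, w, phi=0.5))
print(mle)  # coincides with a Kemeny ranking of the profile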

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a model Mr: the ground truth Θ generates the votes P1, …, Pn
Step 1 (statistical inference): from the data D = (P1, P2, …, Pn), extract information about the ground truth
Step 2 (decision making): output a decision (winner, ranking, etc.)
81

Example: Kemeny
Decision space: a unique winner
Mr = Mallows model
Step 1 (MLE): the most probable ranking
Step 2: its top-1 alternative
Data D = (P1, P2, …, Pn)
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p (credit: Panos Ipeirotis & Roy Radner)
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = 0.714^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
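The two calculations on the slide, spelled out (a uniform prior gives a Beta(11, 5) posterior, and Pr(2 heads | data) = E[p^2]):

heads, tails = 10, 4

# Likelihood reasoning: plug in the MLE of p
p_mle = heads / (heads + tails)
print(p_mle ** 2)                              # ≈ 0.510

# Bayesian reasoning: uniform prior -> Beta(heads+1, tails+1) posterior, and
# Pr(2 heads | data) = E[p^2] = a(a+1) / ((a+b)(a+b+1)) for a Beta(a, b) posterior
a, b = heads + 1, tails + 1
print(a * (a + 1) / ((a + b) * (a + b + 1)))   # ≈ 0.485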

Kemeny = Likelihood approach
Mr = Mallows model
Step 1 (MLE): the most probable ranking
Step 2: its top-1 alternative
This is the Kemeny rule (for single winner)!
Data D = (P1, P2, …, Pn)
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
Step 1: compute the likelihood of all parameters (opinions)
Step 2: choose the top alternative of the most probable ranking
Data D = (P1, P2, …, Pn)
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet's model
Step 1: Bayesian update, giving a posterior over rankings
Step 2: the most likely top-1 alternative
Data D = (P1, P2, …, Pn)
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
• Model selection: how can we evaluate fitness?
• Likelihood or Bayesian? (here, focus on MLE)
• Computation: how can we compute the MLE efficiently?
(figure: Data D → inference → information about the ground truth → decision making → Decision)
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2 ≻ c1 ≻ c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
(figure: the utility distributions μ1, μ2, μ3 centered at θ1, θ2, θ3, with sampled utilities U1, U2, U3)

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
(figure: the parameters (θ1, θ2, θ3) generate each agent's ranking independently; e.g. Agent 1: P1 = c2 ≻ c1 ≻ c3, …, Agent n: Pn = c1 ≻ c2 ≻ c3)
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1, …, λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = [λ1 / (λ1 + λ2 + … + λm)] × [λ2 / (λ2 + … + λm)] × … × [λm-1 / (λm-1 + λm)]
– the i-th factor is the probability that ci is the top choice in {ci, …, cm}
• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
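A direct transcription of the P-L formula above; the λ values are illustrative:

def plackett_luce_probability(ranking, lam):
    # lam: dict mapping each alternative to its positive parameter λ
    prob = 1.0
    remaining = list(ranking)
    for c in ranking[:-1]:
        prob *= lam[c] / sum(lam[a] for a in remaining)  # c is the top choice among the remaining
        remaining.remove(c)
    return prob

lam = {'a': 3.0, 'b': 2.0, 'c': 1.0}
print(plackett_luce_probability(('a', 'b', 'c'), lam))  # (3/6) * (2/3) = 1/3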

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫ from -∞ to ∞ ∫ from Um to ∞ … ∫ from U2 to ∞ of μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
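Since no closed form is known for this integral, a simple Monte Carlo sketch estimates the probability of a ranking by sampling the perceived utilities (the means and unit variance below are illustrative assumptions):

import random

def rum_ranking_probability(ranking, means, sigma=1.0, samples=100000, rng=random.Random(0)):
    # estimate Pr(ranking) under a RUM with independent normal utilities
    hits = 0
    for _ in range(samples):
        u = {c: rng.gauss(means[c], sigma) for c in means}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

means = {'a': 1.0, 'b': 0.5, 'c': 0.0}
print(rum_ranking_probability(('a', 'b', 'c'), means))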

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t:
– E-step: for any set of parameters Θ, compute the expected log-likelihood ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), approximated by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Repeat until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                           LL            Pred. LL       AIC             BIC
Value(Normal) - Value(PL)  44.8 (15.8)   87.4 (30.5)    -79.6 (31.6)    -50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

(figure: CS ∩ Social Choice; computational thinking + optimization algorithms; strategic thinking + methods/principles of aggregation)

Thank you!


Slide 10

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 11

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16
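
A brute-force sketch (mine) of the Kemeny rule, feasible only for a handful of alternatives, using the Kendall tau distance defined above; the small profile is only for illustration.

```python
from itertools import combinations, permutations

def kendall_tau(v, w):
    """Number of pairs of alternatives on which rankings v and w disagree."""
    return sum(
        (v.index(a) < v.index(b)) != (w.index(a) < w.index(b))
        for a, b in combinations(v, 2)
    )

def kemeny(profile):
    alternatives = profile[0]
    return min(
        permutations(alternatives),
        key=lambda w: sum(kendall_tau(v, list(w)) for v in profile),
    )

profile = [list("bca"), list("abc"), list("abc")]
print(kendall_tau(list("bca"), list("abc")))  # 2
print(kemeny(profile))                        # ('a', 'b', 'c')
```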

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
                           Condorcet consistency   Consistency   Easy to compute
Positional scoring rules             N                   Y               Y
Kemeny                               Y                   N               N
Ranked pairs                         Y                   N               Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that select a single
winner, anonymity is not compatible with
neutrality
– proof sketch: W.L.O.G. consider two alternatives and two voters with
opposite rankings (Alice: A > B, Bob: B > A), and suppose A wins. By
anonymity, exchanging the two votes cannot change the winner; by
neutrality, swapping the names of the two alternatives must swap the
winner; but both operations yield the same profile, a contradiction.
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3, and consider 7 votes over {a, b, c}:
3 Voters: a > b > c
2 Voters: b > c > a
1 Voter: b > a > c
1 Voter: c > a > b
– a is the Condorcet winner: it beats b 4-3 and c 4-3 in pairwise elections
– but a's total score is 3s1 + 2s2 + 2s3, which is less than b's total score
3s1 + 3s2 + 1s3 whenever s2 > s3, so a does not win

22
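
A quick check (my own code) of the counterexample above: a is the Condorcet winner of the 7-vote profile, yet it is outscored by b for every score vector with s2 > s3; Borda scores (2, 1, 0) are used for the numerical illustration.

```python
from collections import Counter
from itertools import combinations

profile = [list("abc")] * 3 + [list("bca")] * 2 + [list("bac")] + [list("cab")]

# pairwise majority comparisons: a beats b 4-3 and c 4-3, so a is the Condorcet winner
for x, y in combinations("abc", 2):
    wins = sum(v.index(x) < v.index(y) for v in profile)
    print(f"{x} vs {y}: {wins}-{len(profile) - wins}")

# positional scores with the Borda vector (2, 1, 0): since s2 > s3, b outscores a
scores = Counter()
for v in profile:
    for pos, alt in enumerate(v):
        scores[alt] += (2, 1, 0)[pos]
print(scores)  # a: 8, b: 9, c: 4, so the scoring winner is b, not a
```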

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
– If you believe A1, A2, A3 are the most desirable properties, then X is
optimal
– (anonymity + neutrality + consistency + continuity) ⇔ positional scoring
rules [Young SIAMAM-75]
– (neutrality + consistency + Condorcet consistency) ⇔ Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: n·m·log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (yes/no)
– 2^11 = 2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1, K2, …, Kn  →  merging operator

• Judgment aggregation [List and Pettit EP-02]
              Action P   Action Q   Liable? (P∧Q)
Judge 1           Y          Y            Y
Judge 2           Y          N            N
Judge 3           N          Y            N
Majority          Y          Y            N
36
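
The table above is the classic doctrinal paradox; a tiny sketch (mine) reproduces it by taking issue-wise majorities.

```python
judges = {
    "Judge 1": {"P": True,  "Q": True},
    "Judge 2": {"P": True,  "Q": False},
    "Judge 3": {"P": False, "Q": True},
}

def majority(values):
    return sum(values) > len(values) / 2

maj_P = majority([j["P"] for j in judges.values()])
maj_Q = majority([j["Q"] for j in judges.values()])
maj_liable = majority([j["P"] and j["Q"] for j in judges.values()])

print(maj_P, maj_Q, maj_liable)          # True True False
print((maj_P and maj_Q) == maj_liable)   # False: the issue-wise majority outcome is inconsistent
```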

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under the plurality rule
(ties are broken in favor of a fixed alternative)
Alice, Bob, and Carol each rank three alternatives; under the plurality rule,
one of them can obtain an outcome she prefers by misreporting her ranking.

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: a voter submits 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is so complicated that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Republic of Venice
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Why prevent manipulation?  →  manipulation may lead to very undesirable outcomes
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?  →  Yes
Is it a strong barrier?  →  No (how often? seemingly not very often)
Other barriers?  →  limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to cryptography
NPHard

For which common voting rules
is manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
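
A brute-force UCM checker (my own sketch, feasible only for tiny instances): it searches over all joint manipulator votes. Plurality with lexicographic tie-breaking is used here as the rule r; both the rule and the tie-breaking are illustrative assumptions, not part of the definition above.

```python
from collections import Counter
from itertools import permutations, product

def plurality_winner(profile):
    scores = Counter(v[0] for v in profile)
    # lexicographic tie-breaking (an assumption made for this sketch)
    return min(scores, key=lambda a: (-scores[a], a))

def ucm(rule, profile_nm, n_manipulators, c, alternatives):
    """Is there a joint manipulator vote making c the winner of PNM ∪ PM under `rule`?"""
    for votes in product(permutations(alternatives), repeat=n_manipulators):
        if rule(profile_nm + [list(v) for v in votes]) == c:
            return True
    return False

profile_nm = [list("abc"), list("bca"), list("bca"), list("cab")]
print(ucm(plurality_winner, profile_nm, 1, "a", "abc"))  # True
print(ucm(plurality_winner, profile_nm, 1, "c", "abc"))  # False
```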

The stunningly big table for UCM
#manipulators            One manipulator          At least two
Copeland                 P [BTT SCW-89b]          NPC [FHS AAMAS-08,10]
STV                      NPC [BO SCW-91]          NPC [BO SCW-91]
Veto                     P [ZPR AIJ-09]           P [ZPR AIJ-09]
Plurality with runoff    P [ZPR AIJ-09]           P [ZPR AIJ-09]
Cup                      P [CSL JACM-07]          P [CSL JACM-07]
Borda                    P [BTT SCW-89b]          NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                  P [BTT SCW-89b]          NPC [XZP+ IJCAI-09]
Ranked pairs             NPC [XZP+ IJCAI-09]      NPC [XZP+ IJCAI-09]
Bucklin                  P [XZP+ IJCAI-09]        P [XZP+ IJCAI-09]
Nanson’s rule            NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
Baldwin’s rule           NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

(figure) the manipulators’ power as a function of their number: with o(√n)
manipulators they have essentially no power, with ω(√n) they are all-powerful,
and the phase transition occurs at Θ(√n)
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
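
A sketch of the kind of greedy algorithm referred to above, for Borda: each manipulator ranks c first and then the remaining alternatives from lowest to highest current score, so the strongest competitors receive the fewest points. This is my paraphrase of the approach in [Zuckerman, Procaccia, & Rosenschein AIJ-09]; the published algorithm and its analysis may differ in details, and ties are assumed to be broken in favor of c.

```python
from collections import Counter

def borda_scores(profile, alternatives):
    m = len(alternatives)
    scores = Counter({a: 0 for a in alternatives})
    for v in profile:
        for pos, a in enumerate(v):
            scores[a] += m - 1 - pos
    return scores

def greedy_borda_uco(profile_nm, c, alternatives):
    """Add manipulator votes one at a time until c wins (ties broken in favor of c)."""
    profile = [list(v) for v in profile_nm]
    manipulator_votes = []
    while True:
        scores = borda_scores(profile, alternatives)
        if all(scores[c] >= scores[a] for a in alternatives if a != c):
            return len(manipulator_votes), manipulator_votes
        # c first; then the others from lowest to highest current score
        others = sorted((a for a in alternatives if a != c), key=lambda a: scores[a])
        vote = [c] + others
        manipulator_votes.append(vote)
        profile.append(vote)

profile_nm = [list("abcd"), list("abcd"), list("bcda"), list("dcba")]
# greedy uses 3 manipulators here; the optimum for this instance is 2,
# consistent with the additive error of at most 1
print(greedy_borda_uco(profile_nm, "d", "abcd"))
```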

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
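
For reference, Q|pmtn|Cmax has a classical closed-form optimum (the bound attained by the level algorithm of Horvath, Lam & Sethi and by Gonzalez & Sahni): sort jobs by nonincreasing work and machines by nonincreasing speed, and take the maximum of (total work)/(total speed) and, for each k, (work of the k largest jobs)/(speed of the k fastest machines). A small sketch of that formula (my code, stating a standard result rather than anything specific to the tutorial):

```python
def min_makespan(work, speeds):
    """Optimal makespan for preemptive scheduling on uniform machines (Q|pmtn|Cmax)."""
    work = sorted(work, reverse=True)
    speeds = sorted(speeds, reverse=True)
    bounds = [sum(work) / sum(speeds)]
    for k in range(1, min(len(work), len(speeds))):
        bounds.append(sum(work[:k]) / sum(speeds[:k]))
    return max(bounds)

# three machines with speeds 3, 2, 1 and four jobs
print(min_makespan([9, 6, 4, 2], [3, 2, 1]))  # 3.5
```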

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
• After the first manipulator casts V1 = [c > c1 > c2 > c3]:
– alternative ci corresponds to job Ji with remaining work pi − p − (s1 − si+1)
– machine Mi has speed s1 − si+1
55

The approximation algorithm
Original UCO  →  scheduling problem  →  optimal preemptive schedule
[Gonzalez&Sahni JACM 78]  →  rounding  →  solution to the UCO,
no more than OPT + m − 2
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly coordinate their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(pictures of alternatives A, B, C)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
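
A quick numerical illustration (mine) of the statement above: with p = 0.6, the probability that the majority of n agents matches the ground truth approaches 1 as n grows.

```python
from math import comb

def majority_correct(n, p):
    """Probability that more than half of n i.i.d. agents agree with the ground truth (n odd)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n // 2 + 1, n + 1))

for n in (1, 11, 101, 1001):
    print(n, round(majority_correct(n, 0.6), 4))
# roughly 0.6, 0.75, 0.98, and essentially 1: converging to 1
```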

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
– if c≻d in W: w/p p, c≻d in V; w/p 1-p, d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
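
A brute-force check (mine, small m only) that the MLE ranking under the Mallows model coincides with a Kemeny ranking: the normalization constant of Pr(V|W) ∝ ϕ^Kendall(V,W) is the same for every W, so maximizing the likelihood of W is the same as minimizing the total Kendall tau distance to the profile.

```python
from itertools import combinations, permutations
from math import log

def kendall(v, w):
    return sum((v.index(a) < v.index(b)) != (w.index(a) < w.index(b))
               for a, b in combinations(v, 2))

def mallows_log_likelihood(W, profile, phi):
    # the normalization constant does not depend on W, so it is dropped
    return sum(kendall(list(P), list(W)) for P in profile) * log(phi)

profile = [list("abc"), list("abc"), list("bca"), list("cab")]
rankings = list(permutations("abc"))
mle = max(rankings, key=lambda W: mallows_log_likelihood(W, profile, phi=0.5))
kemeny = min(rankings, key=lambda W: sum(kendall(list(P), list(W)) for P in profile))
print(mle, kemeny)  # both are ('a', 'b', 'c')
```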

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– MLE: p = 10/14 ≈ 0.714
– Pr(2 heads | p = 0.714) = 0.714^2 ≈ 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data), assuming a uniform prior
– compute Pr(2 heads | Data) ≈ 0.485 < 0.5
– No!

83
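
The two numbers on this slide can be reproduced exactly; a tiny sketch (mine), using the fact that with a uniform prior the posterior after 10 heads and 4 tails is Beta(11, 5).

```python
# Likelihood reasoning: plug in the MLE
p_mle = 10 / 14
print(p_mle ** 2)             # ≈ 0.51  -> "Yes"

# Bayesian: with a uniform (Beta(1,1)) prior the posterior is Beta(11, 5), and
# Pr(next two tosses are heads | data) = E[p^2] = (11/16) * (12/17)
print((11 / 16) * (12 / 17))  # ≈ 0.485 -> "No"
```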

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the top-alternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: most likely top-1

Mr = Condorcet’s model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                        Anonymity, neutrality,   Consistency   Condorcet   Easy to
                        monotonicity                                       compute
Likelihood (Mallows)              Y                    N            Y          N
Bayesian (Condorcet)              Y                    N            N          Y

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2 ≻ c1 ≻ c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
(figure: the utility distributions μ1, μ2, μ3, centered at θ1, θ2, θ3, with sampled utilities U1, U2, U3)

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96
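
A minimal sketch (mine) of the generative process above, using normal utility distributions with unit variance (a Thurstonian random utility model): each agent samples one utility per alternative and reports the induced ranking.

```python
import random

def sample_profile(theta, n, sigma=1.0, seed=0):
    """theta: dict alternative -> mean utility; returns n sampled rankings, best first."""
    rng = random.Random(seed)
    profile = []
    for _ in range(n):
        utilities = {a: rng.gauss(mu, sigma) for a, mu in theta.items()}
        profile.append(sorted(theta, key=lambda a: -utilities[a]))
    return profile

theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}  # illustrative parameters
for ranking in sample_profile(theta, 5):
    print(" > ".join(ranking))
```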

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = λ1/(λ1 + … + λm) × λ2/(λ2 + … + λm) × … × λm-1/(λm-1 + λm)
– the i-th factor is the probability that ci is the top choice among {ci, …, cm}

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
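
A small sketch (mine) of the product formula above; as a sanity check, the probabilities of all m! rankings sum to one.

```python
from itertools import permutations
from math import isclose

def plackett_luce_prob(ranking, lam):
    """Pr(ranking | λ) under the Plackett-Luce model; `ranking` lists alternatives best first."""
    prob, remaining = 1.0, sum(lam[a] for a in ranking)
    for a in ranking[:-1]:
        prob *= lam[a] / remaining
        remaining -= lam[a]
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}  # illustrative parameters
print(plackett_luce_prob(["c1", "c2", "c3"], lam))  # (3/6) * (2/3) = 1/3
print(isclose(sum(plackett_luce_prob(list(r), lam) for r in permutations(lam)), 1.0))  # True
```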

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{+∞} μm(Um) ∫_{Um}^{+∞} μm-1(Um-1) ⋯ ∫_{U2}^{+∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                            LL            Pred. LL       AIC             BIC
Value(Normal) − Value(PL)   44.8 (15.8)   87.4 (30.5)    -79.6 (31.6)    -50.5 (31.6)

Red: statistically significant with 95% confidence
102
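
For reference, the standard definitions behind these criteria (my summary, not taken from the paper): AIC = 2k − 2 log L and BIC = k ln n − 2 log L, where k is the number of free parameters and n the number of observations; lower values indicate a better fit/complexity trade-off.

```python
from math import log

def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    return k * log(n) - 2 * log_likelihood

# hypothetical numbers, only to show the calculation
print(aic(-120.0, k=9), bic(-120.0, k=9, n=50))
```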

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS + Social Choice:
Computational thinking + optimization algorithms
Strategic thinking + methods/principles of aggregation

Thank you!

Slide 12

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 13

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Rule                     | Condorcet consistency | Consistency | Easy to compute
Positional scoring rules | N                     | Y           | Y
Kemeny                   | Y                     | N           | N
Ranked pairs             | Y                     | N           | Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that select a single
winner, anonymity is not compatible with
neutrality
– Proof sketch: take two alternatives {a, b} and two voters:
Alice: a > b
Bob: b > a
W.L.O.G. suppose the winner is a. Renaming a↔b turns this into the
profile { Alice: b > a, Bob: a > b }, which by anonymity is the same
profile as before (the two voters have simply swapped votes), so its
winner is still a; but neutrality requires its winner to be b. Contradiction.
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3, and consider 7 votes over {a, b, c}:
3 Voters: a > b > c
2 Voters: b > c > a
1 Voter: b > a > c
1 Voter: c > a > b
– a is the Condorcet winner: it beats b and c 4-3 in pairwise elections
– but score(a) = 3s1 + 2s2 + 2s3 < 3s1 + 3s2 + 1s3 = score(b), because s2 > s3
– so a does not win under any positional scoring rule
22
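A quick check of this counterexample (a sketch using the labels a, b, c from the reconstruction above; Borda's (2,1,0) stands in for an arbitrary s1 > s2 > s3):

profile = 3 * [('a', 'b', 'c')] + 2 * [('b', 'c', 'a')] + [('b', 'a', 'c'), ('c', 'a', 'b')]

def positional_scores(profile, s):
    totals = {x: 0 for x in profile[0]}
    for ranking in profile:
        for pos, x in enumerate(ranking):
            totals[x] += s[pos]
    return totals

def is_condorcet_winner(profile, x):
    others = [y for y in profile[0] if y != x]
    return all(sum(r.index(x) < r.index(y) for r in profile) > len(profile) / 2 for y in others)

print(is_condorcet_winner(profile, 'a'))       # True: a beats both b and c pairwise
print(positional_scores(profile, (2, 1, 0)))   # b gets 9 > 8 for a, so a does not win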

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: a voting rule satisfies axioms A1, A2, A3 ⇔ it is rule X
– If you believe A1, A2, A3 are the most desirable properties, then X is
optimal
– (anonymity + neutrality + consistency + continuity) ⇔ positional scoring
rules [Young SIAMAM-75]
– (neutrality + consistency + Condorcet consistency) ⇔ Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives = { main courses } × { wines }
30

Multiple referenda
• In California, voters voted on 11 binary issues (yes/no)
– 2^11 = 2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
(figure: belief bases K1, K2, …, Kn are combined by a merging operator)

• Judgment aggregation [List and Pettit EP-02]
          | Action P | Action Q | Liable? (P∧Q)
Judge 1   | Y        | Y        | Y
Judge 2   | Y        | N        | N
Judge 3   | N        | Y        | N
Majority  | Y        | Y        | N
36
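A tiny sketch reproducing the table above: proposition-wise majority accepts P and Q but rejects P∧Q, although every individual judge is logically consistent.

judges = [
    {'P': True,  'Q': True},    # Judge 1
    {'P': True,  'Q': False},   # Judge 2
    {'P': False, 'Q': True},    # Judge 3
]

def majority(values):
    return sum(values) > len(values) / 2

maj_p = majority([j['P'] for j in judges])                   # True
maj_q = majority([j['Q'] for j in judges])                   # True
maj_liable = majority([j['P'] and j['Q'] for j in judges])   # False: only Judge 1 supports P∧Q

print(maj_p, maj_q, maj_liable)   # True True False: the majority judgments are not closed under ∧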

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
• (Example slide: ties are broken in favor of a fixed alternative; Alice, Bob,
and Carol each submit a ranking over three alternatives, and one voter can
change the plurality winner to an alternative she prefers by misreporting
her ranking)

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: a voter submits 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is so complicated that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Republic of Venice
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
• How often is manipulation possible? Seems not very often
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the use of computational hardness in cryptography

For which common voting rules is
manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
Rule                  | One manipulator      | At least two manipulators
Copeland              | P [BTT SCW-89b]      | NPC [FHS AAMAS-08,10]
STV                   | NPC [BO SCW-91]      | NPC [BO SCW-91]
Veto                  | P [ZPR AIJ-09]       | P [ZPR AIJ-09]
Plurality with runoff | P [ZPR AIJ-09]       | P [ZPR AIJ-09]
Cup                   | P [CSL JACM-07]      | P [CSL JACM-07]
Borda                 | P [BTT SCW-89b]      | NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin               | P [BTT SCW-89b]      | NPC [XZP+ IJCAI-09]
Ranked pairs          | NPC [XZP+ IJCAI-09]  | NPC [XZP+ IJCAI-09]
Bucklin               | P [XZP+ IJCAI-09]    | P [XZP+ IJCAI-09]
Nanson's rule         | NPC [NWX AAAI-11]    | NPC [NWX AAAI-11]
Baldwin's rule        | NPC [NWX AAAI-11]    | NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– with o(√n) manipulators, w.h.p. the manipulators have no power (they
cannot change the winner); with ω(√n) manipulators, w.h.p. they are
all-powerful (they can make any alternative win); the threshold is Θ(√n)
manipulators
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
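A sketch of the greedy idea for Borda UCO (an illustrative reimplementation rather than the published algorithm, and it assumes ties are broken in favor of c): each added manipulator ranks c first and then gives the largest remaining scores to the currently weakest opponents, until c wins.

def greedy_borda_uco(base_scores, c):
    # base_scores: alternative -> Borda points from the non-manipulators' profile
    m = len(base_scores)
    scores = dict(base_scores)
    others = [x for x in scores if x != c]
    votes = []
    while any(scores[x] > scores[c] for x in others):    # assumption: ties favor c
        scores[c] += m - 1                               # c is always ranked first
        order = sorted(others, key=lambda x: scores[x])  # weakest opponents first
        for pos, x in enumerate(order):
            scores[x] += m - 2 - pos                     # weakest gets m-2, ..., strongest gets 0
        votes.append([c] + order)
    return votes

votes = greedy_borda_uco({'c': 4, 'a': 9, 'b': 7, 'd': 6}, 'c')
print(len(votes), votes)   # 2 manipulators suffice here; the slide notes this greedy style is within 1 of optimal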

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
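A sketch of the optimal preemptive makespan for Q|pmtn|Cmax, using the standard lower bound (total work over total speed, and the k largest jobs over the k fastest machines) that the Gonzalez-Sahni construction attains; treat it as an illustration under these assumptions, not their algorithm.

def preemptive_makespan(jobs, speeds):
    # jobs: processing requirements; speeds: machine speeds; returns the optimal Cmax
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    bounds = [sum(jobs) / sum(speeds)]                 # all the work spread over all the machines
    for k in range(1, min(len(jobs), len(speeds)) + 1):
        bounds.append(sum(jobs[:k]) / sum(speeds[:k])) # k largest jobs need the k fastest machines
    return max(bounds)

print(preemptive_makespan(jobs=[8, 5, 3], speeds=[3, 2, 1]))   # 8/3: the largest job alone needs 8/3 on the fastest machine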

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1
obtain in the non-manipulators' profile PNM
• A manipulator's vote, e.g. V1 = [c > c1 > c2 > c3], adds s1 points to c,
s2 to c1, s3 to c2, and s4 to c3; after V1 the score gaps become
p1 - p - (s1 - s2), p2 - p - (s1 - s3), p3 - p - (s1 - s4)
• So view alternative ci as a job Ji whose workload is (roughly) its score
gap pi - p, and view position i+1 of a manipulator's vote as a machine
with speed s1 - si+1: placing ci in that position for one vote reduces
the gap by s1 - si+1
55

The approximation algorithm
• Reduce the original UCO instance to a Q|pmtn|Cmax scheduling problem
• Solve the scheduling problem optimally [Gonzalez&Sahni JACM 78]
• Round the (preemptive) schedule back into manipulator votes
• The resulting solution to the UCO uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators' votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(three pictures A, B, C to be ranked by crowd workers)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
• Given
– two alternatives {a, b}
– a fixed probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
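A small simulation of the theorem (p = 0.6 and the trial counts are arbitrary choices): the fraction of trials in which the majority matches the ground truth climbs toward 1 as n grows.

import random

def majority_correct_rate(n, p, trials=2000):
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))  # i.i.d. agents, each right w/p p
        hits += correct_votes > n / 2
    return hits / trials

random.seed(0)
for n in (1, 11, 101, 1001):
    print(n, majority_correct_rate(n, p=0.6))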

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
– if c≻d in W, then c≻d in V with probability p, and d≻c in V with probability 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78
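A brute-force sketch (small m only; φ = 0.5 is an arbitrary choice) verifying the statement above: since Pr(V|W) ∝ φ^Kendall(V,W) with φ < 1 and the normalizer does not depend on W, maximizing the likelihood is the same as minimizing the total Kendall tau distance, i.e., Kemeny.

from itertools import combinations, permutations
from math import prod

def kendall(v, w):
    pv = {x: i for i, x in enumerate(v)}
    pw = {x: i for i, x in enumerate(w)}
    return sum((pv[x] - pv[y]) * (pw[x] - pw[y]) < 0 for x, y in combinations(v, 2))

def mallows_mle(profile, phi=0.5):
    likelihood = lambda W: prod(phi ** kendall(P, W) for P in profile)  # up to a constant factor
    return max(permutations(profile[0]), key=likelihood)

def kemeny_ranking(profile):
    return min(permutations(profile[0]), key=lambda W: sum(kendall(P, W) for P in profile))

D = [('a', 'b', 'c'), ('a', 'b', 'c'), ('b', 'c', 'a'), ('c', 'a', 'b')]
print(mallows_mle(D) == kemeny_ranking(D))   # True: the MLE ranking is the Kemeny ranking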

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a model Mr: a ground truth Θ generates the votes P1, P2, …, Pn
Data D = (P1, …, Pn)
→ Step 1: statistical inference → information about the ground truth
→ Step 2: decision making → Decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; decision space: a unique winner
Data D = (P1, P2, …, Pn)
→ Step 1: MLE → the most probable ranking
→ Step 2: top-1 alternative → Winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
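The two numbers on this slide can be reproduced directly; the Bayesian side uses the uniform prior stated above, so the posterior of p after 10 heads and 4 tails is Beta(11, 5) and Pr(two heads | data) = E[p^2]. A sketch:

heads, tails = 10, 4

# Likelihood (MLE) reasoning: plug in the point estimate
p_mle = heads / (heads + tails)
print(p_mle, p_mle ** 2)                              # 0.714..., 0.510... > 0.5 -> predict two heads

# Bayesian reasoning: posterior Beta(heads+1, tails+1) under a uniform prior
a, b = heads + 1, tails + 1
e_p_squared = a * (a + 1) / ((a + b) * (a + b + 1))   # E[p^2] for Beta(a, b)
print(e_p_squared)                                    # 0.485... < 0.5 -> do not predict two heads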

Kemeny = Likelihood approach
Mr = Mallows model
Data D = (P1, P2, …, Pn)
→ Step 1: MLE → the most probable ranking
→ Step 2: top-1 alternative → Winner
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
Data D = (P1, P2, …, Pn)
→ Step 1: compute the likelihood of all parameters (opinions) → the most probable ranking
→ Step 2: choose its top-ranked alternative → Winner
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet's model
Data D = (P1, P2, …, Pn)
→ Step 1: Bayesian update → posterior over rankings
→ Step 2: most likely top-1 alternative → Winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                     | Anonymity, neutrality, monotonicity | Consistency | Condorcet | Easy to compute
Likelihood (Mallows) | Y                                   | N           | Y         | N
Bayesian (Condorcet) | Y                                   | Y           | N         | N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(the framework picture again: Data D → inference → information about the ground truth → decision making → Decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2 ≻ c1 ≻ c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
(figure: the utility distributions μ1, μ2, μ3, centered at θ1, θ2, θ3, with sampled utilities U1, U2, U3)

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
(figure: the same parameters θ1, θ2, θ3 generate every agent's ranking independently)
Agent 1: P1 = c2 ≻ c1 ≻ c3
…
Agent n: Pn = c1 ≻ c2 ≻ c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor is the probability that c1 is the top choice in {c1,…,cm},
the second that c2 is the top choice in {c2,…,cm}, …, and the last that
cm-1 is preferred to cm
• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
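A sketch of the P-L model (the weights λ are illustrative): the probability of a ranking follows the product formula above, and a ranking can be sampled by perturbing log-weights with i.i.d. Gumbel noise, which is the Gumbel connection mentioned on this slide.

import math, random

def pl_prob(ranking, lam):
    # product of sequential-choice probabilities: each alternative is the top choice among those remaining
    remaining, prob = list(ranking), 1.0
    for x in ranking[:-1]:
        prob *= lam[x] / sum(lam[y] for y in remaining)
        remaining.remove(x)
    return prob

def pl_sample(lam):
    gumbel = lambda: -math.log(-math.log(random.random()))     # standard Gumbel noise
    noisy = {x: math.log(w) + gumbel() for x, w in lam.items()}
    return tuple(sorted(lam, key=noisy.get, reverse=True))

lam = {'a': 3.0, 'b': 2.0, 'c': 1.0}
print(pl_prob(('a', 'b', 'c'), lam))   # 3/6 * 2/3 = 1/3
random.seed(0)
print(pl_sample(lam))                  # a random ranking, biased toward putting 'a' first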

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) … ∫_{U2}^{∞} μ1(U1) dU1 … dUm-1 dUm
(Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞)

98
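Since the integral above has no closed form, a simple Monte Carlo estimate (illustrative means and variance; accuracy grows with the number of samples) just draws utilities and counts how often the sampled order matches:

import random

def mc_rank_prob(means, sd=1.0, samples=100_000):
    # estimates Pr(U1 > U2 > ... > Um) for independent Ui ~ Normal(means[i], sd)
    hits = 0
    for _ in range(samples):
        u = [random.gauss(m, sd) for m in means]
        hits += all(u[i] > u[i + 1] for i in range(len(u) - 1))
    return hits / samples

random.seed(0)
print(mc_rank_prob(means=[1.0, 0.5, 0.0]))   # rough estimate of Pr(c1 ≻ c2 ≻ c3 | Θ)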

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL) | LL: 44.8 (15.8) | Pred. LL: 87.4 (30.5) | AIC: -79.6 (31.6) | BIC: -50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

(closing figure: CS contributes computational thinking + optimization algorithms;
Social Choice contributes strategic thinking + methods/principles of aggregation)
Thank you!


Slide 14

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 15

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
  – Google it!

• Gibbard-Satterthwaite theorem
  – Next section

• Axiomatic characterization
  – Template: A voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
  – If you believe A1, A2, A3 are the most desirable properties, then X is
    optimal
  – (anonymity + neutrality + consistency + continuity) ⟺ positional scoring
    rules [Young SIAMAM-75]
  – (neutrality + consistency + Condorcet consistency) ⟺ Kemeny
    [Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26


Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: n·m·log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
  – Issues = { Main course, Wine }
  – Alternatives = {main course options} × {wine options}
30

Multiple referenda
• In California, voters voted on 11 binary (yes/no) issues
  – 2^11 = 2048 combinations in total
  – 5/11 are about budget and taxes
    • Prop.30 Increase sales
      and some income tax
      for education
    • Prop.38 Increase
      income tax on almost
      everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
  – knowledge bases K1, K2, …, Kn are sent to a merging operator

• Judgment aggregation [List and Pettit EP-02]
              Action P   Action Q   Liable? (P∧Q)
  Judge 1        Y          Y            Y
  Judge 2        Y          N            N
  Judge 3        N          Y            N
  Majority       Y          Y            N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
  vote that does not represent her true
  preferences, to make herself better off
• A voting rule is strategy-proof if there is never
  a (beneficial) manipulation under this rule
• How important is strategy-proofness as a
  desired axiomatic property?
  – compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of a fixed alternative)
• Three voters (Alice, Bob, Carol) each rank three alternatives; by
  misreporting her ranking, one voter can change the plurality winner
  to an alternative she prefers over the truthful outcome

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
  by economists
  – Single-peaked preferences
  – Approval voting: a voter submits 0 or 1 for each
    alternative

41

Computational thinking
• Use a voting rule that is so complicated that
  nobody can easily predict the winner
  – Dodgson
  – Kemeny
  – The randomized voting rule used in the Republic of Venice
    for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
• Manipulation is inevitable
  (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very
  undesirable outcomes
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
  – How often is manipulation a problem? Seems not very often
• Other barriers?
  – Limited information
  – Limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the reasoning in cryptography
NPHard

For which common voting rules is
manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
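A minimal brute-force sketch in Python (not part of the original slides) of the UCM decision problem on small instances; the rule used here (plurality with lexicographic tie-breaking) and the example profile are illustrative assumptions:

from collections import Counter
from itertools import combinations_with_replacement, permutations

def plurality(profile):
    counts = Counter(vote[0] for vote in profile)
    return max(sorted(counts), key=lambda a: counts[a])  # lexicographic tie-breaking

def ucm(rule, profile_nm, n_manipulators, c):
    # is there a profile for the manipulators that makes c the winner under rule?
    all_rankings = list(permutations(profile_nm[0]))
    # votes are unweighted, so only the multiset of manipulator rankings matters
    for manipulator_votes in combinations_with_replacement(all_rankings, n_manipulators):
        if rule(list(profile_nm) + list(manipulator_votes)) == c:
            return True
    return False

profile_nm = [("a", "b", "c")] * 4 + [("b", "c", "a")] * 3
print(ucm(plurality, profile_nm, 2, "b"))  # True: two more first-place votes for b suffice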

The stunningly big table for UCM
  #manipulators           One manipulator          At least two
  Copeland                P [BTT SCW-89b]          NPC [FHS AAMAS-08,10]
  STV                     NPC [BO SCW-91]          NPC [BO SCW-91]
  Veto                    P [ZPR AIJ-09]           P [ZPR AIJ-09]
  Plurality with runoff   P [ZPR AIJ-09]           P [ZPR AIJ-09]
  Cup                     P [CSL JACM-07]          P [CSL JACM-07]
  Borda                   P [BTT SCW-89b]          NPC [DKN+ AAAI-11] [BNW IJCAI-11]
  Maximin                 P [BTT SCW-89b]          NPC [XZP+ IJCAI-09]
  Ranked pairs            NPC [XZP+ IJCAI-09]      NPC [XZP+ IJCAI-09]
  Bucklin                 P [XZP+ IJCAI-09]        P [XZP+ IJCAI-09]
  Nanson’s rule           NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
  Baldwin’s rule          NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
  – Including many common voting rules
  – If the number of manipulators is o(√n), then with probability →1 the
    coalition has no power; if it is ω(√n), then with probability →1 the
    coalition is all-powerful; the phase transition is at Θ(√n) manipulators
• Computational complexity is not a strong barrier against
  manipulation
  – UCM as a decision problem is easy to compute in most
    cases
  – The case of Θ(√n) has been studied experimentally in
    [Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
  (and resumed later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
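A minimal sketch in Python (not part of the original slides) of the classical closed form for the optimal preemptive makespan on uniform machines: sort jobs and speeds decreasingly, then take the largest of the partial-sum ratios and the total-work ratio.

def min_makespan(jobs, speeds):
    # processing requirements and machine speeds, both sorted from largest to smallest
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    k_max = min(len(jobs), len(speeds))
    bounds = [sum(jobs[:k]) / sum(speeds[:k]) for k in range(1, k_max)]
    bounds.append(sum(jobs) / sum(speeds[:k_max]))  # all the work on all usable machines
    return max(bounds)

print(min_makespan(jobs=[8, 5, 3], speeds=[3, 2, 1]))  # max(8/3, 13/5, 16/6) = 2.666...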

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1
  obtain in the non-manipulators’ profile
• After the first manipulator’s vote V1 = [c > c1 > c2 > c3], think of each
  alternative ci as a job Ji whose remaining processing requirement is the
  gap it still has to fall behind c (e.g. p1 - p - (s1 - s2) for c1), and of the
  positions below c in a manipulator’s vote as machines with speeds
  s1 - s2, s1 - s3, …, s1 - sm
55

The approximation algorithm
• Convert the original UCO instance into a Q|pmtn|Cmax scheduling
  problem
• Solve the scheduling problem optimally [Gonzalez&Sahni JACM 78]
• Round the (preemptive) schedule back to a solution of the UCO
• The rounded solution uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
  scheduling (preemptions at integer time points)
  – Borda manipulation corresponds to scheduling
    where the machine speeds are m-1, m-2, …, 0
    • NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

  – UCM for Borda is NP-complete for two manipulators
    • [Davies et al. AAAI-11 best paper]
    • [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt (worst-case complexity as a barrier) seems to fail

• Can we obtain positive results for a
  restricted setting?
  – The manipulators have complete information
    about the non-manipulators’ votes
  – The manipulators can perfectly coordinate their
    strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
  – The leader broadcasts a vote W, and the potential
    followers decide whether to cast W or not
    • The leader and followers have the same preferences

  – Safe manipulation [Slinko&White COMSOC-08]: a vote
    W such that
    • No matter how many followers there are, the
      leader and potential followers are not worse off
    • Sometimes they are better off

  – Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
  – {adding, deleting} × {voters, alternatives}
  – partitioning voters/alternatives
  – introducing clones of alternatives
  – changing the agenda of voting
  – [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
  survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
  these for generalized scoring rules

69

Food for thought
• The problem of protecting elections from strategic behavior is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(three pictures A, B, and C)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
  – two alternatives {a,b}, one of which is the ground truth
  – a probability p > 0.5
• Suppose
  – each agent’s preference is generated i.i.d., such that
    – w/p p, it is the same as the ground truth
    – w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of the agents’ preferences
  converges in probability to the ground truth
75
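A minimal simulation sketch in Python (not part of the original slides); the value p = 0.6 and the trial counts are arbitrary choices:

import random

def majority_accuracy(n, p, trials=2000, seed=0):
    # fraction of trials in which the majority of n i.i.d. agents matches the ground truth
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        agreeing = sum(rng.random() < p for _ in range(n))
        correct += agreeing > n / 2
    return correct / trials

for n in (1, 11, 101, 1001):
    print(n, majority_accuracy(n, p=0.6))  # the accuracy increases towards 1 as n grows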

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem
  can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
  – all combinations of opinions: an opinion is a pairwise comparison between
    candidates (the combination can be cyclic)
  – p<1

• Sample space
  – all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
  each opinion is i.i.d.:
  – if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
  generate a ranking V w.p.
  – Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
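A minimal sketch in Python (not part of the original slides) checking this on a tiny profile: the ranking maximizing the Mallows likelihood is exactly a ranking minimizing the total Kendall tau distance (a Kemeny ranking); the value φ = 0.5 and the profile are arbitrary:

from itertools import combinations, permutations
from math import prod

def kendall_tau(V, W):
    pos_V = {a: i for i, a in enumerate(V)}
    pos_W = {a: i for i, a in enumerate(W)}
    return sum((pos_V[a] < pos_V[b]) != (pos_W[a] < pos_W[b]) for a, b in combinations(V, 2))

def mallows_likelihood(profile, W, phi):
    # Pr(V | W) is proportional to phi ** K(V, W); the normalization constant does not
    # depend on W, so it can be dropped when comparing candidate ground truths
    return prod(phi ** kendall_tau(V, W) for V in profile)

profile = ["abc", "abc", "bca", "bac"]
rankings = list(permutations("abc"))
mle = max(rankings, key=lambda W: mallows_likelihood(profile, W, phi=0.5))
kemeny = min(rankings, key=lambda W: sum(kendall_tau(V, W) for V in profile))
print(mle, kemeny)  # the two coincide (up to tie-breaking)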

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
• Given a parametric model Mr with ground truth Θ and data D = (P1, …, Pn)
• Step 1 (statistical inference): extract information about the
  ground truth Θ from the data D
• Step 2 (decision making): given Mr and that information, output a
  decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): compute the most probable ranking from the data D
• Step 2: output the top-1 alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
  – You observe 10 heads, 4 tails
  – Do you think the next two tosses will be two heads in a row?
  (Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
  – there is an unknown but fixed ground truth
  – p = 10/14 = 0.714
  – Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
  – Yes!

• Bayesian
  – the ground truth is captured by a belief distribution
  – Compute Pr(p|Data) assuming a uniform prior
  – Compute Pr(2 heads | Data) = 0.485 < 0.5
  – No!

83
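A minimal sketch in Python (not part of the original slides) reproducing the two numbers on this slide, using the fact that with a uniform prior the posterior over p is Beta(11, 5):

h, t = 10, 4

# likelihood (plug-in) reasoning: point estimate p_hat, then p_hat ** 2
p_hat = h / (h + t)
print(p_hat, p_hat ** 2)          # 0.714..., 0.510... > 0.5

# Bayesian reasoning: with a uniform prior, p | Data ~ Beta(h + 1, t + 1),
# and Pr(2 heads | Data) = E[p^2] under that posterior
a, b = h + 1, t + 1
print((a / (a + b)) * ((a + 1) / (a + b + 1)))   # 0.4852... < 0.5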

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1 (MLE): compute the most probable ranking from the data D
• Step 2: output its top-1 alternative
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood of all parameters (opinion sets) given the data D
• Step 2: output the top-1 alternative of the most probable ranking
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet’s model
• Step 1 (Bayesian update): compute the posterior over rankings from the data D
• Step 2: output the most likely top-1 alternative
86
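A minimal sketch in Python (not part of the original slides) of one natural way to pick a Bayesian winner under Condorcet's model: with a uniform prior over opinion sets the posterior factorizes over pairs, so the posterior probability that an alternative beats every other one is a product of per-pair terms. The value p = 0.8, the example profile, and this particular estimator are assumptions for illustration, not necessarily the exact estimator studied in the cited poster:

from itertools import combinations

def bayesian_condorcet_winner(profile, p=0.8):
    alternatives, n_votes = sorted(set(profile[0])), len(profile)
    above = {(a, b): 0 for a in alternatives for b in alternatives if a != b}
    for vote in profile:
        position = {x: i for i, x in enumerate(vote)}
        for a, b in combinations(alternatives, 2):
            if position[a] < position[b]:
                above[(a, b)] += 1
            else:
                above[(b, a)] += 1

    def posterior_beats(a, b):
        # posterior probability (uniform prior over opinion sets) that a beats b
        k = above[(a, b)]
        like_ab = p ** k * (1 - p) ** (n_votes - k)
        like_ba = p ** (n_votes - k) * (1 - p) ** k
        return like_ab / (like_ab + like_ba)

    def prob_top(c):
        # posterior probability that c beats every other alternative in the ground truth
        result = 1.0
        for d in alternatives:
            if d != c:
                result *= posterior_beats(c, d)
        return result

    return max(alternatives, key=prob_top)

print(bayesian_condorcet_winner(["abc", "abc", "bca", "bac", "cab"]))  # 'a'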

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                        Anonymity, neutrality,   Consistency   Condorcet   Easy to compute
                        monotonicity
Likelihood (Mallows)              Y                   N             Y             N
Bayesian (Condorcet)              Y                   N             N             Y

Decision space: single winners
Assume a uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
  – MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
    then r(D1∪D2)=r(D1)∩r(D2)
  – All classical voting rules except positional scoring rules
    are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
  – All MLE rules that output winners satisfy anonymity and
    consistency
  – Positional scoring rules are the only voting rules that satisfy
    anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
• Model selection
  – How can we evaluate fitness?

• Likelihood or Bayesian?
  – Focus on MLE

• Computation
  – How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi} (U2 > U1 > U3)

95

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏_{R∈Data} Pr(R | θ1, θ2, θ3)
• Example: Agent 1 draws P1 = c2≻c1≻c3, …, Agent n draws Pn = c1≻c2≻c3,
  all i.i.d. from the model with parameters (θ1, θ2, θ3)
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
  – A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
  Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm)
    = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
  – the first factor is the probability that c1 is the top choice in {c1,…,cm},
    the second that c2 is the top choice in {c2,…,cm}, …, the last that
    cm-1 is preferred to cm

• Pros:
  – Computationally tractable
    • Analytical solution to the likelihood function
  – The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
  and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
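A minimal sketch in Python (not part of the original slides) evaluating the P-L probability of a ranking; the weights λ are arbitrary:

def plackett_luce_prob(ranking, lam):
    # ranking: alternatives from best to worst; lam: alternative -> positive weight
    prob, remaining = 1.0, sum(lam[a] for a in ranking)
    for a in ranking[:-1]:      # the final factor is always 1
        prob *= lam[a] / remaining
        remaining -= lam[a]
    return prob

lam = {"a": 3.0, "b": 2.0, "c": 1.0}
print(plackett_luce_prob(["a", "b", "c"], lam))  # (3/6) * (2/3) = 1/3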

RUM with normal distributions
• μi’s are normal distributions
  – Thurstone’s Case V [Thurstone 27]

• Pros:
  – Intuitive
  – Flexible

• Cons: believed to be computationally intractable
  – No analytical solution for the likelihood function Pr(P | Θ) is known:
    Pr(c1 ≻ … ≻ cm | Θ)
      = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) … ∫_{U2}^{∞} μ1(U1) dU1 … dUm-1 dUm
    (Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞)

98
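A minimal Monte Carlo sketch in Python (not part of the original slides) for estimating such a ranking probability under a normal RUM; the means, standard deviations, and sample size are arbitrary choices:

import random

def mc_ranking_prob(means, sds, samples=200_000, seed=0):
    # estimates the probability that alternative 0 gets the highest perceived utility,
    # alternative 1 the next highest, and so on
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        utilities = [rng.gauss(mu, sd) for mu, sd in zip(means, sds)]
        hits += all(utilities[i] > utilities[i + 1] for i in range(len(utilities) - 1))
    return hits / samples

print(mc_ranking_prob(means=[1.0, 0.5, 0.0], sds=[1.0, 1.0, 1.0]))  # a Monte Carlo estimate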

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential
  family (EF)
  – Includes normal, Gamma, exponential, Binomial, Gumbel,
    etc.

• In each iteration t
  • E-step: for any set of parameters Θ,
    – compute the expected log likelihood (ELL)
      ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
    – the expectation (the g(Data, Θt) part) is approximately computed
      by Gibbs sampling
  • M-step
    – Choose Θt+1 = argmax_Θ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with normal distributions and
  P-L for
  – log-likelihood (LL)
  – predictive log-likelihood (Pred. LL)
  – Akaike information criterion (AIC)
  – Bayesian information criterion (BIC)

• Tested on an election dataset
  – 9 alternatives, randomly chosen 50 voters

  Value(Normal) - Value(PL):   LL 44.8 (15.8)   Pred. LL 87.4 (30.5)   AIC -79.6 (31.6)   BIC -50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
  game-theoretic analysis

3. Statistical approaches
• Framework based on
  statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 16

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 17

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nm log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives = {main courses} × {wines}
30

Multiple referenda
• In California, voters voted on 11 binary issues (yes/no)
– 2^11 = 2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
            Action P   Action Q   Liable? (P∧Q)
Judge 1     Y          Y          Y
Judge 2     Y          N          N
Judge 3     N          Y          N
Majority    Y          Y          N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of a fixed priority order over the alternatives)
[figure: Alice, Bob, and Carol each rank three pictured alternatives; under the plurality rule, one of them can obtain an outcome she prefers by misreporting her ranking]

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences
– Approval voting: a voter submits 0 or 1 for each alternative

41

Computational thinking
• Use a voting rule so complicated that nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Republic of Venice for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes; but how often? (seems not very often)
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM

                        One manipulator        At least two manipulators
Copeland                P [BTT SCW-89b]        NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]        NPC [BO SCW-91]
Veto                    P [ZPR AIJ-09]         P [ZPR AIJ-09]
Plurality with runoff   P [ZPR AIJ-09]         P [ZPR AIJ-09]
Cup                     P [CSL JACM-07]        P [CSL JACM-07]
Borda                   P [BTT SCW-89b]        NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                 P [BTT SCW-89b]        NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]    NPC [XZP+ IJCAI-09]
Bucklin                 P [XZP+ IJCAI-09]      P [XZP+ IJCAI-09]
Nanson's rule           NPC [NWX AAAI-11]      NPC [NWX AAAI-11]
Baldwin's rule          NPC [NWX AAAI-11]      NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50
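A toy Monte Carlo sketch of this question for the plurality rule under the impartial culture assumption; the specific setup (10 i.i.d. voters, the last of whom considers manipulating, lexicographic tie-breaking) is my own choice for illustration.

```python
import itertools
import random
from collections import Counter

def plurality_winner(votes):
    # lexicographic tie-breaking among the highest plurality scores
    scores = Counter(v[0] for v in votes)
    return min(scores, key=lambda a: (-scores[a], a))

def manipulable_by_last_voter(votes, alternatives):
    """Can the last voter obtain a strictly better winner by misreporting?"""
    truth = votes[-1]
    honest = plurality_winner(votes)
    for fake in itertools.permutations(alternatives):
        w = plurality_winner(votes[:-1] + [list(fake)])
        if truth.index(w) < truth.index(honest):
            return True
    return False

random.seed(0)
alternatives = list("abc")
trials = 2000
freq = sum(
    manipulable_by_last_voter([random.sample(alternatives, 3) for _ in range(10)],
                              alternatives)
    for _ in range(trials)) / trials
print(freq)  # empirical frequency with which the last voter can manipulate
```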

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– as a function of the number of manipulators: below Θ(√n) they have no power, above Θ(√n) they are all-powerful, and Θ(√n) is the phase transition
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
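A sketch of the greedy idea for Borda (add manipulators one by one, each ranking c first and the currently strongest opponent last); this is a simplified illustration in the spirit of the algorithm cited above, with ties broken in favor of c, not a faithful reimplementation.

```python
def borda_scores(profile, alternatives):
    m = len(alternatives)
    s = {a: 0 for a in alternatives}
    for R in profile:
        for pos, a in enumerate(R):
            s[a] += m - 1 - pos
    return s

def greedy_borda_manipulators(profile_nm, alternatives, c):
    """Add manipulator votes until c wins Borda (ties broken in favor of c).
    Each manipulator ranks c first, then the others from the currently weakest
    (right after c) to the currently strongest (last position)."""
    manipulator_votes = []
    while True:
        s = borda_scores(profile_nm + manipulator_votes, alternatives)
        if all(s[c] >= s[a] for a in alternatives if a != c):
            return manipulator_votes
        others = sorted((a for a in alternatives if a != c), key=lambda a: s[a])
        manipulator_votes.append([c] + others)

profile_nm = [list("abcd"), list("abcd"), list("dcba")]
print(greedy_borda_manipulators(profile_nm, list("abcd"), "d"))  # one vote suffices here
```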

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted (and may resume later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
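For Q|pmtn|Cmax the optimal makespan has a classical closed form (sort jobs and speeds in decreasing order and take the largest of the prefix ratios, together with total work over total speed); the sketch below assumes there are at least as many jobs as machines.

```python
def optimal_makespan(jobs, speeds):
    """Optimal makespan for Q|pmtn|Cmax (classical formula): with jobs and speeds
    sorted in decreasing order, take the maximum of (sum of k longest jobs) /
    (sum of k fastest speeds) over k < #machines, and (total work)/(total speed).
    Assumes at least as many jobs as machines."""
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    bounds = [sum(jobs) / sum(speeds)]
    for k in range(1, len(speeds)):
        bounds.append(sum(jobs[:k]) / sum(speeds[:k]))
    return max(bounds)

print(optimal_makespan([8, 5, 3], [4, 2, 1]))  # -> 16/7 ≈ 2.286
```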

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
• Every manipulator ranks c first, so each manipulator vote gives c an additional s1 points
• Map each other alternative ci to a job Ji whose remaining work is its score gap pi − p, and map the position a manipulator gives ci to a machine: placing ci in position j+1 of a manipulator vote narrows its gap to c by s1 − s_{j+1}
• Example with V1 = [c > c1 > c2 > c3]: c1's gap shrinks by s1 − s2, c2's by s1 − s3, and c3's by s1 − s4, so the machine speeds are s1 − s2, s1 − s3, …, s1 − sm
55

The approximation algorithm
Original UCO → reduce to a scheduling problem → solve it optimally [Gonzalez&Sahni JACM-78] → round the schedule back to manipulator votes → solution to the UCO with cost no more than OPT + m-2
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators' votes
– The manipulators can perfectly coordinate their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[three pictures: A, B, C]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p it agrees with the ground truth
– w/p 1-p it differs from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
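A quick simulation of the theorem (my own illustration): with p = 0.6, the probability that the majority matches the ground truth climbs toward 1 as n grows.

```python
import random

def majority_correct_frequency(n_voters, p, trials=5000, seed=0):
    """Estimate Pr(the majority of n i.i.d. voters matches the ground truth)
    when each voter is independently correct with probability p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        correct = sum(rng.random() < p for _ in range(n_voters))
        hits += correct > n_voters / 2
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_frequency(n, p=0.6))
```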

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
c≻d in W  →  w/p p: c≻d in V;  w/p 1-p: d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
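A small illustration, assuming brute-force enumeration over all rankings (so only tiny m): sample a profile from Mallows with ground truth a > b > c > d and check that the MLE, i.e. the Kemeny ranking, typically recovers it.

```python
import random
from itertools import permutations

def kendall(V, W):
    pos = {a: i for i, a in enumerate(W)}
    return sum(1 for i in range(len(V)) for j in range(i + 1, len(V))
               if pos[V[i]] > pos[V[j]])

def sample_mallows(W, phi, rng):
    """One ranking V with Pr(V|W) proportional to phi ** kendall(V, W)."""
    rankings = list(permutations(W))
    weights = [phi ** kendall(V, W) for V in rankings]
    return rng.choices(rankings, weights=weights)[0]

def mallows_mle(profile):
    """MLE of the ground truth ranking = the Kemeny ranking of the profile."""
    alternatives = profile[0]
    return min(permutations(alternatives),
               key=lambda W: sum(kendall(V, W) for V in profile))

rng = random.Random(0)
truth, phi = tuple("abcd"), 0.3
profile = [sample_mallows(truth, phi, rng) for _ in range(50)]
print(mallows_mle(profile))  # typically recovers ('a', 'b', 'c', 'd')
```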

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a parametric model Mr with ground truth Θ:
Step 1 (statistical inference): from the data D = (P1, …, Pn), extract information about the ground truth Θ
Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; decision space: a unique winner
Step 1 (MLE): infer the most probable ranking from the data D = (P1, …, Pn)
Step 2: output its top-1 alternative as the winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p|Data) assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
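The two calculations above take only a few lines; the Bayesian number uses the standard Beta(11,5) posterior under a uniform prior, so Pr(2 heads | data) = E[p^2] = 11·12/(16·17) ≈ 0.485.

```python
from fractions import Fraction

heads, tails = 10, 4

# Likelihood reasoning: point estimate p = 10/14, then plug it in
p_mle = Fraction(heads, heads + tails)
print(float(p_mle ** 2))  # ≈ 0.510 > 0.5  -> "Yes"

# Bayesian reasoning with a uniform prior: the posterior over p is Beta(11, 5),
# so Pr(2 heads | data) = E[p^2] = a*(a+1) / ((a+b)*(a+b+1))
a, b = heads + 1, tails + 1
print(float(Fraction(a * (a + 1), (a + b) * (a + b + 1))))  # ≈ 0.485 < 0.5 -> "No"
```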

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the top alternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                        Anonymity, neutrality,   Consistency   Condorcet    Easy to
                        monotonicity                                        compute
Likelihood (Mallows)    Y                        N             Y            N
Bayesian (Condorcet)    Y                        Y             N            N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2) ≠ ∅,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[framework diagram: Data D → inference → information about the ground truth → decision making → Decision]
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
[figure: the three utility distributions μ1, μ2, μ3 (parameterized by θ1, θ2, θ3) and one draw of perceived utilities U1, U2, U3]
95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96
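A minimal sketch of this generative process for a normal RUM with unit variance; the parameter values are hypothetical.

```python
import random

def sample_ranking(thetas, sigma=1.0, rng=random):
    """One vote from a normal RUM: U_i ~ N(theta_i, sigma^2) independently,
    and the vote ranks alternatives by decreasing perceived utility."""
    U = {a: rng.gauss(mu, sigma) for a, mu in thetas.items()}
    return sorted(thetas, key=lambda a: U[a], reverse=True)

random.seed(0)
thetas = {"c1": 1.0, "c2": 0.5, "c3": 0.0}  # hypothetical ground-truth means
profile = [sample_ranking(thetas) for _ in range(5)]
for P in profile:
    print(" > ".join(P))
```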

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
  Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
  – the i-th factor is the probability that ci is the top choice among {ci,…,cm}
• Pros:
  – Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
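The sequential-choice form of the P-L probability translates directly into code; a small sketch with hypothetical λ values:

```python
def plackett_luce_prob(ranking, lam):
    """Pr(ranking | lambda) under Plackett-Luce: at each step the next alternative
    is chosen with probability proportional to its lambda among those remaining."""
    prob, remaining = 1.0, list(ranking)
    while len(remaining) > 1:
        top = remaining.pop(0)
        prob *= lam[top] / (lam[top] + sum(lam[a] for a in remaining))
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}             # hypothetical parameters
print(plackett_luce_prob(["c1", "c2", "c3"], lam))  # 3/6 * 2/3 = 1/3
```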

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
(Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞)

98
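Since the nested integral has no known analytical solution, one can estimate it by Monte Carlo; a rough sketch (my own illustration, unit-variance normals, hypothetical θ values):

```python
import random

def mc_ranking_prob(ranking, thetas, sigma=1.0, samples=100_000, seed=0):
    """Monte Carlo estimate of Pr(ranking | Theta) under a normal RUM with a
    common fixed variance, approximating the nested integral above by sampling."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        U = {a: rng.gauss(thetas[a], sigma) for a in thetas}
        if all(U[ranking[i]] > U[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

thetas = {"c1": 1.0, "c2": 0.5, "c3": 0.0}  # hypothetical means
print(mc_ranking_prob(["c1", "c2", "c3"], thetas))
```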

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
  ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), where g(Data, Θt) is approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                            LL            Pred. LL       AIC             BIC
Value(Normal) - Value(PL)   44.8 (15.8)   87.4 (30.5)    -79.6 (31.6)    -50.5 (31.6)

Red: statistically significant with 95% confidence
102
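For reference, the two criteria in the table are simple functions of the maximized log-likelihood, the number of parameters k, and the number of votes n (lower is better); the numbers below are hypothetical, not taken from the dataset above.

```python
from math import log

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2*LL (lower is better)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k*ln(n) - 2*LL (lower is better)."""
    return k * log(n) - 2 * log_likelihood

# hypothetical numbers: a model with k = 9 parameters fitted to n = 50 votes
print(aic(-830.0, k=9), bic(-830.0, k=9, n=50))
```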

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 18

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 19

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nm log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives = {main course options} × {wine options} (pictures omitted)
30
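A minimal sketch of how a multi-issue domain is built as a Cartesian product (the issue names and values here are illustrative placeholders, not the pictures from the slide):

```python
from itertools import product

# Two issues with small value sets; A = D1 x D2
issues = {'main course': ['beef', 'fish', 'vegetarian'],
          'wine': ['red', 'white']}

alternatives = list(product(*issues.values()))
print(len(alternatives), 'alternatives:', alternatives)

# With p binary issues the domain has 2^p alternatives, e.g. p = 11 gives 2048
print(2 ** 11)
```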

Multiple referenda
• In California, voters voted on 11 binary issues (yes/no)
– 2^11 = 2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]

           Action P   Action Q   Liable? (P∧Q)
Judge 1    Y          Y          Y
Judge 2    Y          N          N
Judge 3    N          Y          N
Majority   Y          Y          N
36
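The majority row of the table can be reproduced directly; a minimal sketch showing that issue-wise majority yields a logically inconsistent collective judgment:

```python
# Each judge's judgments on the two premises P, Q; the conclusion "liable" is P and Q
judges = [
    {'P': True,  'Q': True},   # Judge 1
    {'P': True,  'Q': False},  # Judge 2
    {'P': False, 'Q': True},   # Judge 3
]

def majority(values):
    return sum(values) > len(values) / 2

maj_P = majority([j['P'] for j in judges])
maj_Q = majority([j['Q'] for j in judges])
maj_conclusion = majority([j['P'] and j['Q'] for j in judges])

print(maj_P, maj_Q, maj_conclusion)                         # True True False
print('consistent?', (maj_P and maj_Q) == maj_conclusion)   # False: the paradox
```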

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of a fixed alternative; the alternatives were shown as pictures, omitted here)
– Alice, Bob, and Carol each submit a ranking; by misreporting her true ranking, one voter can change the plurality winner to an alternative she prefers

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: a voter submits 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is so complicated that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
• Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
– Why prevent manipulation? It may lead to very undesirable outcomes
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
– How often? Seems not very often
• Other barriers?
– Limited information
– Limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the situation in cryptography
NPHard

For which common voting rules is
manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
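For intuition, UCM can be checked by brute force on tiny instances (a minimal sketch; plurality with lexicographic tie-breaking is only an illustrative choice of r, and the helper names are mine):

```python
from itertools import permutations, combinations_with_replacement

def plurality_winner(profile, alternatives):
    """Plurality with lexicographic tie-breaking."""
    scores = {a: 0 for a in alternatives}
    for vote in profile:
        scores[vote[0]] += 1
    return min(alternatives, key=lambda a: (-scores[a], a))

def ucm(rule, p_nm, n_manip, c, alternatives):
    """Is there a manipulator profile making c the winner? (exponential search)"""
    votes = list(permutations(alternatives))
    # votes are unweighted, so only the multiset of manipulator votes matters
    for p_m in combinations_with_replacement(votes, n_manip):
        if rule(list(p_nm) + [list(v) for v in p_m], alternatives) == c:
            return True
    return False

alts = ['a', 'b', 'c']
p_nm = [['a', 'b', 'c'], ['a', 'c', 'b'], ['b', 'c', 'a']]
print(ucm(plurality_winner, p_nm, 2, 'b', alts))  # two manipulators can make b win
```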

The stunningly big table for UCM

Rule                    One manipulator         At least two manipulators
Copeland                P [BTT SCW-89b]         NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]         NPC [BO SCW-91]
Veto                    P [ZPR AIJ-09]          P [ZPR AIJ-09]
Plurality with runoff   P [ZPR AIJ-09]          P [ZPR AIJ-09]
Cup                     P [CSL JACM-07]         P [CSL JACM-07]
Borda                   P [BTT SCW-89b]         NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                 P [BTT SCW-89b]         NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]     NPC [XZP+ IJCAI-09]
Bucklin                 P [XZP+ IJCAI-09]       P [XZP+ IJCAI-09]
Nanson's rule           NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
Baldwin's rule          NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– with o(√n) manipulators the coalition has no power; with ω(√n) manipulators
it is all-powerful; the transition happens at Θ(√n) manipulators
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
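For Q|pmtn|Cmax the optimal makespan has a classical closed form (sort jobs and speeds in decreasing order, then take the maximum of the k-prefix ratios and the total-work ratio); a minimal sketch:

```python
def preemptive_makespan(jobs, speeds):
    """Classical closed form for the optimal makespan of Q|pmtn|Cmax.

    jobs:   processing requirements p_1..p_n
    speeds: machine speeds s_1..s_m
    """
    p = sorted(jobs, reverse=True)
    s = sorted(speeds, reverse=True)
    k_max = min(len(p), len(s))          # at most this many machines run at once
    bounds = [sum(p[:k]) / sum(s[:k]) for k in range(1, k_max)]
    bounds.append(sum(p) / sum(s[:k_max]))   # all the work over the usable speed
    return max(bounds)

print(preemptive_makespan(jobs=[8, 5, 3], speeds=[3, 2, 1]))  # 8/3
```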

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1
obtain in the non-manipulators’ profile
• Each manipulator ranks c first, e.g. V1 = [c > c1 > c2 > c3]
• Alternative ci corresponds to job Ji with remaining workload pi - p
• The i-th position below c in a manipulator’s vote corresponds to a machine
with speed s1 - si+1 (for m = 4: s1 - s2, s1 - s3, s1 - s4): placing ci in that
position increases c’s lead over ci by exactly that amount
55

The approximation algorithm
Original UCO → scheduling problem → optimal preemptive schedule
[Gonzalez&Sahni JACM 78] → rounding → solution to the UCO,
no more than OPT + m - 2
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– a probability p > 0.5
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
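A quick simulation of the theorem (a minimal sketch; p = 0.6 and the values of n are arbitrary choices):

```python
import random

def majority_correct(n, p, trials=2000):
    """Fraction of trials in which the majority matches the ground truth."""
    wins = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        wins += correct_votes > n / 2
    return wins / trials

# With p = 0.6, the majority converges to the ground truth as n grows
for n in [1, 11, 101, 1001]:
    print(n, majority_correct(n, p=0.6))
```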

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is generated i.i.d.:
– if c≻d in W, then w.p. p the opinion is c≻d in V, and w.p. 1-p it is d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
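A brute-force sketch of this equivalence: under Mallows, maximizing the likelihood is the same as minimizing the total Kendall tau distance, which is exactly the Kemeny ranking (the example profile below is illustrative):

```python
from itertools import combinations, permutations

def kendall(v, w):
    """Kendall tau distance: number of pairwise disagreements between two rankings."""
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2)
               if (pos_v[a] - pos_v[b]) * (pos_w[a] - pos_w[b]) < 0)

def kemeny_ranking(profile, alternatives):
    """Kemeny ranking = Mallows MLE for any 0 < phi < 1, since
    Pr(D|W) ∝ phi ** (total Kendall tau distance to W)."""
    return min(permutations(alternatives),
               key=lambda w: sum(kendall(v, w) for v in profile))

profile = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c')]
print(kemeny_ranking(profile, ['a', 'b', 'c']))   # ('a', 'b', 'c')
```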

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p|Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
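Both numbers on this slide can be reproduced directly (a minimal sketch; the Bayesian side uses the uniform prior stated above, i.e. a Beta(1,1) prior, so the posterior is Beta(11,5)):

```python
# Data: 10 heads, 4 tails
heads, tails = 10, 4

# Likelihood (MLE) reasoning: plug in the point estimate p_hat
p_hat = heads / (heads + tails)                 # 0.714...
print('MLE:      Pr(2 heads) =', p_hat ** 2)    # ~0.51 > 0.5

# Bayesian reasoning with a uniform prior: posterior is Beta(heads+1, tails+1),
# and Pr(2 heads | data) = E[p^2] = (a/(a+b)) * ((a+1)/(a+b+1)) with a=11, b=5
a, b = heads + 1, tails + 1
pr_two_heads = (a / (a + b)) * ((a + 1) / (a + b + 1))
print('Bayesian: Pr(2 heads) =', pr_two_heads)  # ~0.485 < 0.5
```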

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the top alternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: most likely top-1

Mr = Condorcet’s model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠∅,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(framework: Data D → inference → information about the ground truth → decision making → Decision)

• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
(figure: the utility distributions μ1, μ2, μ3, parameterized by θ1, θ2, θ3, with sampled utilities U1, U2, U3)

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
(figure: the parameters θ1, θ2, θ3 generate each agent’s vote independently,
e.g. Agent 1: P1 = c2≻c1≻c3, …, Agent n: Pn = c1≻c2≻c3)
96
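A minimal sketch of this generative process for a normal RUM with unit variance (the parameter values are arbitrary):

```python
import random

def sample_ranking(theta, sigma=1.0):
    """Sample one vote from a normal RUM: draw a utility per alternative, then sort."""
    utilities = {c: random.gauss(mu, sigma) for c, mu in theta.items()}
    return sorted(utilities, key=utilities.get, reverse=True)

def sample_profile(theta, n):
    return [sample_ranking(theta) for _ in range(n)]

theta = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}   # "ground truth" means
for vote in sample_profile(theta, n=5):
    print(' > '.join(vote))
```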

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = [λ1/(λ1+…+λm)] × [λ2/(λ2+…+λm)] × … × [λm-1/(λm-1+λm)]
– the i-th factor is the probability that ci is the top choice among {ci, ci+1, …, cm}

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
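A minimal sketch of the Plackett-Luce sequential-choice formula, both for evaluating the likelihood of a ranking and for sampling one (the λ values below are arbitrary):

```python
import random

def pl_probability(ranking, lam):
    """Pr(ranking | lambda) under Plackett-Luce: product of sequential-choice terms."""
    remaining = list(ranking)
    prob = 1.0
    while len(remaining) > 1:
        top = remaining[0]
        prob *= lam[top] / sum(lam[c] for c in remaining)
        remaining.pop(0)
    return prob

def pl_sample(lam):
    """Sample a ranking by repeatedly choosing a top alternative w.p. proportional to lambda."""
    remaining = list(lam)
    ranking = []
    while remaining:
        weights = [lam[c] for c in remaining]
        choice = random.choices(remaining, weights=weights)[0]
        ranking.append(choice)
        remaining.remove(choice)
    return ranking

lam = {'c1': 3.0, 'c2': 2.0, 'c3': 1.0}
print(pl_probability(['c1', 'c2', 'c3'], lam))   # (3/6) * (2/3) = 1/3
print(pl_sample(lam))
```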

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
(Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞)

98
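Even without an analytical solution, the integral can be approximated by plain Monte Carlo: sample the utilities and count how often they respect the ranking (a minimal sketch with unit-variance normals and arbitrary means):

```python
import random

def mc_ranking_probability(ranking, theta, sigma=1.0, samples=100000):
    """Monte Carlo estimate of Pr(c1 > ... > cm | Theta) for a normal RUM."""
    hits = 0
    for _ in range(samples):
        u = {c: random.gauss(theta[c], sigma) for c in theta}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

theta = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}
print(mc_ranking_probability(['c1', 'c2', 'c3'], theta))
```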

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100
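A toy version of the idea for a normal RUM with fixed variance (a minimal sketch, not the algorithm of the paper: the E-step here uses naive rejection sampling of utilities consistent with each vote instead of Gibbs sampling, which only works for very small m):

```python
import random

def e_step_sample(vote, theta, sigma=1.0, tries=200):
    """Rejection-sample utilities consistent with one vote, given current theta."""
    samples = []
    while len(samples) < 20 and tries > 0:
        tries -= 1
        u = {c: random.gauss(theta[c], sigma) for c in theta}
        if all(u[vote[i]] > u[vote[i + 1]] for i in range(len(vote) - 1)):
            samples.append(u)
    return samples

def mc_em(profile, alternatives, iterations=20):
    theta = {c: 0.0 for c in alternatives}
    for _ in range(iterations):
        # E-step: approximate expected utilities given the votes and current theta
        sums = {c: 0.0 for c in alternatives}
        counts = {c: 0 for c in alternatives}
        for vote in profile:
            for u in e_step_sample(vote, theta):
                for c in alternatives:
                    sums[c] += u[c]
                    counts[c] += 1
        # M-step: with fixed-variance normals, the complete-data MLE of each mean
        # is just the average of the sampled utilities
        theta = {c: sums[c] / max(counts[c], 1) for c in alternatives}
        # the location family is only identified up to a common shift, so anchor one mean
        shift = theta[alternatives[-1]]
        theta = {c: theta[c] - shift for c in alternatives}
    return theta

profile = [['c1', 'c2', 'c3']] * 6 + [['c2', 'c1', 'c3']] * 3 + [['c3', 'c2', 'c1']]
print(mc_em(profile, ['c1', 'c2', 'c3']))
```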

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                           LL            Pred. LL      AIC            BIC
Value(Normal) - Value(PL)  44.8 (15.8)   87.4 (30.5)   -79.6 (31.6)   -50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 20

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 21

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
– knowledge bases K1, K2, …, Kn are combined by a merging operator

• Judgment aggregation [List and Pettit EP-02]

             Action P   Action Q   Liable? (P∧Q)
  Judge 1       Y          Y            Y
  Judge 2       Y          N            N
  Judge 3       N          Y            N
  Majority      Y          Y            N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of the alternative listed first in a fixed order, say a > b > c)
• True preferences:
  Alice: a > b > c
  Bob:   b > a > c
  Carol: c > b > a
• Truthful votes produce a three-way plurality tie, which the tie-breaking order resolves in favor of a, Carol's least-preferred alternative
• If Carol instead reports b > c > a, then b gets two plurality points and wins, so Carol is better off misreporting

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued by economists
– Single-peaked preferences: voters share a one-dimensional axis over the alternatives, and each voter's preference decreases with distance from her peak
– Approval voting: a voter submits 0 or 1 for each alternative

41
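For the single-peaked escape route, a standard illustration (not from the slides) is the median mechanism: with single-peaked preferences over a line, selecting the median reported peak is strategy-proof, since no voter can move the median toward her own peak by misreporting. A minimal sketch:

```python
import statistics

def median_mechanism(peaks):
    """Select the median of the reported peaks (assumes single-peaked preferences)."""
    return statistics.median_low(peaks)  # low median keeps the outcome at a reported peak

peaks = [0.2, 0.4, 0.9]          # truthful peaks on a [0, 1] issue axis
print(median_mechanism(peaks))   # 0.4
# The voter at 0.9 cannot pull the outcome above 0.4: reporting anything higher
# leaves the median unchanged, and reporting lower only moves it further away.
print(median_mechanism([0.2, 0.4, 0.55]))  # still 0.4
```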

Computational thinking
• Use a voting rule that is so complicated that nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Venice Republic for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
• Why prevent manipulation? It may lead to very undesirable outcomes. How often? Seemingly not very often
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a manipulator to compute a manipulation, she is best off voting truthfully
– similar to the reasoning used in cryptography

For which common voting rules is manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]
• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, & Lang JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
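For intuition (and for tiny instances only), UCM can be decided by brute force: enumerate every possible joint vote of the n' manipulators and check whether c wins. The sketch below is my own, with Borda as the example rule r and my own helper names; it makes the exponential blow-up in n' and m explicit.

```python
from itertools import permutations, product

def borda_winner(profile):
    """Borda winner with lexicographic tie-breaking."""
    m = len(profile[0])
    scores = {c: 0 for c in profile[0]}
    for vote in profile:
        for pos, c in enumerate(vote):
            scores[c] += m - 1 - pos
    return max(sorted(scores), key=lambda c: scores[c])

def ucm_brute_force(non_manipulators, n_manip, c, rule=borda_winner):
    """Is there a joint manipulator vote making c the winner? O((m!)^n')."""
    alts = non_manipulators[0]
    for manip_votes in product(permutations(alts), repeat=n_manip):
        if rule(list(non_manipulators) + list(manip_votes)) == c:
            return True
    return False

pnm = [("b", "a", "c"), ("b", "c", "a"), ("a", "c", "b")]
print(borda_winner(pnm))                       # b wins with truthful votes only
print(ucm_brute_force(pnm, n_manip=1, c="a"))  # True: one manipulator can make a win
```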

The stunningly big table for UCM

  Rule                   One manipulator        At least two manipulators
  Copeland               P [BTT SCW-89b]        NPC [FHS AAMAS-08,10]
  STV                    NPC [BO SCW-91]        NPC [BO SCW-91]
  Veto                   P [ZPR AIJ-09]         P [ZPR AIJ-09]
  Plurality with runoff  P [ZPR AIJ-09]         P [ZPR AIJ-09]
  Cup                    P [CSL JACM-07]        P [CSL JACM-07]
  Borda                  P [BTT SCW-89b]        NPC [DKN+ AAAI-11, BNW IJCAI-11]
  Maximin                P [BTT SCW-89b]        NPC [XZP+ IJCAI-09]
  Ranked pairs           NPC [XZP+ IJCAI-09]    NPC [XZP+ IJCAI-09]
  Bucklin                P [XZP+ IJCAI-09]      P [XZP+ IJCAI-09]
  Nanson's rule          NPC [NWX AAAI-11]      NPC [NWX AAAI-11]
  Baldwin's rule         NPC [NWX AAAI-11]      NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result [Xia&Conitzer EC-08a]
• Theorem. For any generalized scoring rule (including many common voting rules), the power of a manipulating coalition exhibits a phase transition at Θ(√n) manipulators: with asymptotically fewer than √n manipulators the coalition has no power (with high probability it cannot change the outcome), and with asymptotically more than √n manipulators it is all-powerful (with high probability it can make any alternative win)
• Computational complexity is not a strong barrier against manipulation
– UCM as a decision problem is easy to compute in most cases
– The case of Θ(√n) has been studied experimentally in [Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
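The flavor of such greedy algorithms is easy to convey in code. The sketch below is my own illustration of a REVERSE-style greedy for Borda UCO, in the spirit of the [Zuckerman, Procaccia, & Rosenschein AIJ-09] algorithm rather than a verified reimplementation of it: each added manipulator ranks c first and then hands the largest remaining scores to the rivals that are currently furthest behind, stopping once c wins.

```python
def borda_greedy_uco(scores, c, max_manipulators=1000):
    """Greedily add Borda manipulator votes until c wins; returns the votes.

    scores: dict alternative -> Borda score from the non-manipulators.
    Ties are assumed to be broken in favor of c.
    """
    m = len(scores)
    totals = dict(scores)
    votes = []
    while any(totals[x] > totals[c] for x in totals if x != c):
        if len(votes) >= max_manipulators:
            return None
        # Lowest-scored rival gets the next-highest number of points.
        others = sorted((x for x in totals if x != c), key=lambda x: totals[x])
        vote = [c] + others
        for pos, x in enumerate(vote):
            totals[x] += m - 1 - pos
        votes.append(vote)
    return votes

pnm_scores = {"c": 4, "a": 9, "b": 7, "d": 6}
print(borda_greedy_uco(pnm_scores, "c"))  # two manipulator votes suffice here
```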

An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• Preemption: jobs are allowed to be interrupted (and resumed later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the non-manipulators' profile PNM
• Every manipulator ranks c first, so each manipulator vote gives c an extra s1; placing ci in position j gives ci an extra sj, closing c's gap to ci by s1 - sj
• Example with m = 4: after adding V1 = [c > c1 > c2 > c3], the remaining gaps are
– c1: p1 - p - (s1 - s2)
– c2: p2 - p - (s1 - s3)
– c3: p3 - p - (s1 - s4)
• View ci as a job Ji of size pi - p and positions 2, …, m as machines with speeds s1 - s2, …, s1 - sm; each manipulator vote is one unit of time during which every job runs on a distinct machine
55

The approximation algorithm
• Convert the original UCO instance into the scheduling problem above
• Solve the scheduling problem optimally [Gonzalez&Sahni JACM 78]
• Round the (preemptive) schedule back into manipulator votes, giving a solution to the UCO
• The number of manipulators used is no more than OPT + m - 2
56
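To make the reduction concrete, here is a small sketch of my own (not the algorithm from the paper; function names are mine): it builds the scheduling instance from the non-manipulators' scores and evaluates the well-known makespan bound for Q|pmtn|Cmax, the maximum over k of (sum of the k largest jobs) / (sum of the k fastest speeds), which preemptive schedules can achieve. Its ceiling gives a lower-bound estimate of the number of manipulators; the toy scores reuse the Borda example from the greedy sketch above.

```python
import math

def uco_to_scheduling(score_vector, p_c, p_others):
    """Map a positional-scoring UCO instance to a Q|pmtn|Cmax instance.

    score_vector: (s1, ..., sm); p_c: c's score; p_others: the rivals' scores.
    Returns (job sizes, machine speeds).
    """
    s1 = score_vector[0]
    jobs = [max(0, p_i - p_c) for p_i in p_others]   # remaining gaps to close
    speeds = [s1 - s_j for s_j in score_vector[1:]]  # one machine per position 2..m
    return jobs, speeds

def makespan_bound(jobs, speeds):
    """max_k (sum of k largest jobs) / (sum of k fastest speeds)."""
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    return max(sum(jobs[:k]) / sum(speeds[:k]) for k in range(1, len(jobs) + 1))

jobs, speeds = uco_to_scheduling((3, 2, 1, 0), p_c=4, p_others=[9, 7, 6])
print(jobs, speeds)                              # gaps [5, 3, 2], speeds [1, 2, 3]
print(math.ceil(makespan_bound(jobs, speeds)))   # 2 manipulators, matching the greedy run
```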

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-complete for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a restricted setting?
– The manipulators have complete information about the non-manipulators' votes
– The manipulators can perfectly coordinate their strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
• The leader and followers have the same preferences
– Safe manipulation [Slinko&White COMSOC-08]: a vote W such that
• no matter how many followers there are, the leader and potential followers are not worse off
• sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[three pictures: A, B, C]
Turker 1: A > B > C    Turker 2: B > A    …    Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it agrees with the ground truth
– w/p 1-p, it differs from the ground truth
• Then, as n→∞, the majority of agents' preferences converges in probability to the ground truth
75
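A quick Monte Carlo sketch of the theorem (my own illustration; the accuracy p = 0.6 and the sample sizes are arbitrary): as the number of agents grows, the majority vote matches the ground truth more and more often.

```python
import random

def majority_correct(n, p, trials=2000):
    """Fraction of trials in which the majority of n i.i.d. votes matches the truth."""
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        hits += correct_votes > n / 2
    return hits / trials

random.seed(0)
for n in (1, 11, 101, 1001):
    print(n, majority_correct(n, p=0.6))
# The fraction climbs toward 1 as n grows, as the Condorcet Jury theorem predicts.
```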

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given "ground truth" opinions W and p<1, generate opinions V such that each opinion is i.i.d.:
– if c≻d in W, then c≻d in V with probability p, and d≻c in V with probability 1-p
77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a "ground truth" ranking W and ϕ<1, generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
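The claim that the MLE ranking under Mallows is the Kemeny ranking can be checked directly on small profiles. The sketch below is mine (the value ϕ = 0.5 is arbitrary): it enumerates all rankings, computes the Mallows log-likelihood of each, and confirms that the maximizer also minimizes the total Kendall tau distance.

```python
import math
from itertools import combinations, permutations

def kendall_tau(v, w):
    return sum((v.index(a) < v.index(b)) != (w.index(a) < w.index(b))
               for a, b in combinations(w, 2))

def mallows_log_likelihood(profile, w, phi):
    # The normalizing constant depends only on phi and m, so it does not affect the argmax.
    return sum(kendall_tau(v, w) for v in profile) * math.log(phi)

profile = [("a", "b", "c")] * 2 + [("b", "c", "a")] * 2 + [("c", "a", "b")]
rankings = list(permutations(profile[0]))

mle = max(rankings, key=lambda w: mallows_log_likelihood(profile, w, phi=0.5))
kemeny = min(rankings, key=lambda w: sum(kendall_tau(v, w) for v in profile))
print(mle, kemeny, mle == kemeny)  # the two rankings coincide
```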

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Step 1 (statistical inference): from the data D = (P1, P2, …, Pn), infer information about the ground truth Θ under the model Mr
Step 2 (decision making): given Mr, turn that information into a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): from the data D = (P1, P2, …, Pn), compute the most probable ranking
• Step 2: output the top-1 alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = 0.714^2 ≈ 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data), assuming a uniform prior
– Pr(2 heads | Data) ≈ 0.485 < 0.5
– No!
83
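The two numbers on this slide are easy to reproduce. A small sketch of mine: the likelihood answer plugs in the point estimate p = 10/14, while the Bayesian answer integrates p^2 against the Beta(11, 5) posterior that a uniform prior yields after 10 heads and 4 tails.

```python
heads, tails = 10, 4

# Likelihood reasoning: plug in the MLE of p.
p_hat = heads / (heads + tails)
print(p_hat ** 2)  # ~0.510 > 0.5  -> "Yes"

# Bayesian reasoning: uniform prior -> Beta(heads+1, tails+1) posterior,
# and Pr(2 heads | Data) = E[p^2] = a(a+1) / ((a+b)(a+b+1)).
a, b = heads + 1, tails + 1
print(a * (a + 1) / ((a + b) * (a + b + 1)))  # ~0.485 < 0.5 -> "No"
```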

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1 (MLE): compute the most probable ranking from the data D = (P1, P2, …, Pn)
• Step 2: output its top-1 alternative as the winner
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet's model
• Step 1: compute the likelihood of every parameter (every combination of opinions)
• Step 2: output the top-1 alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1 (Bayesian update): compute the posterior over rankings from the data D = (P1, P2, …, Pn)
• Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical decision framework
(Data D → inference → information about the ground truth → decision making → decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
[figure: the utility distributions μ1, μ2, μ3 (parameterized by θ1, θ2, θ3) and sampled utilities U1, U2, U3]
95
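A minimal sketch (mine; the Gaussian means and unit variances are arbitrary) of what this probability means operationally: sample the perceived utilities independently and count how often the event U2 > U1 > U3 occurs.

```python
import random

def prob_order_mc(means, order, trials=100_000):
    """Monte Carlo estimate of Pr(c_{order[0]} > c_{order[1]} > ...) under a normal RUM."""
    hits = 0
    for _ in range(trials):
        u = [random.gauss(m, 1.0) for m in means]   # perceived utilities U1, U2, U3
        ranked = sorted(range(len(means)), key=lambda i: -u[i])
        hits += ranked == list(order)
    return hits / trials

random.seed(0)
# means = (mean of mu1, mu2, mu3); estimate Pr(c2 > c1 > c3), i.e. U2 > U1 > U3.
print(prob_order_mc(means=(0.0, 0.5, -0.5), order=(1, 0, 2)))
```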

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3
Agent 1: P1 = c2≻c1≻c3   …   Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+λ2+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– c1 is the top choice in {c1,…,cm}; c2 is the top choice in {c2,…,cm}; …; cm-1 is preferred to cm

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
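Here is a short sketch (my own; the weights are arbitrary) of the Plackett-Luce choice probability above: the closed form multiplies, position by position, the chosen alternative's λ divided by the sum of the λ's of the alternatives still available.

```python
from itertools import permutations

def plackett_luce_prob(ranking, lam):
    """Probability of a full ranking under Plackett-Luce with weights lam[alt]."""
    prob = 1.0
    remaining = list(ranking)
    for alt in ranking[:-1]:
        prob *= lam[alt] / sum(lam[x] for x in remaining)
        remaining.remove(alt)
    return prob

lam = {"a": 3.0, "b": 2.0, "c": 1.0}
print(plackett_luce_prob(("a", "b", "c"), lam))  # 3/6 * 2/3 = 1/3
# Sanity check: the probabilities of all 3! rankings sum to 1.
print(sum(plackett_luce_prob(r, lam) for r in permutations(lam)))
```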

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
(Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞)

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL), ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):

                 LL            Pred. LL       AIC            BIC
               44.8 (15.8)    87.4 (30.5)   -79.6 (31.6)   -50.5 (31.6)

Red: statistically significant with 95% confidence
102
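As a reminder of how the last two criteria are computed, here is a tiny sketch of mine (the log-likelihood, parameter count, and sample size are placeholders, not the dataset on this slide): AIC = 2k - 2·LL and BIC = k·ln(n) - 2·LL, with lower values preferred.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: lower is better."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: lower is better."""
    return k * math.log(n) - 2 * log_likelihood

# Placeholder numbers purely for illustration.
ll, k, n = -1234.5, 9, 50
print(aic(ll, k), bic(ll, k, n))
```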

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS ↔ Social Choice: computational thinking + optimization algorithms in one direction, strategic thinking + methods/principles of aggregation in the other

Thank you!


Slide 22

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 23

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: a voter submits 0 or 1 for each alternative

41

Computational thinking
• Use a voting rule that is so complicated that nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Republic of Venice for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No. How often does it bite? Seems not very often
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a manipulator to compute a manipulation, she is best off voting truthfully
– similar to the reasoning in cryptography
For which common voting rules is manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a profile PM (of the manipulators) such that c is the winner of PNM ∪ PM under r
46
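To make the definition concrete, here is a brute-force Python sketch of the UCM decision problem; it enumerates all manipulator profiles, so it is exponential in n' and only meant for tiny instances. The plurality rule and its alphabetical tie-breaking below are illustrative choices, not part of the definition.

    # Brute-force check of the UCM decision problem for tiny instances (a sketch).
    from itertools import permutations, product

    def ucm(rule, profile_nm, n_manipulators, c, alternatives):
        """Return a manipulator profile making c win under `rule`, or None.
        rule(profile) -> winning alternative; a profile is a list of rankings (tuples)."""
        all_votes = list(permutations(alternatives))
        for manip_profile in product(all_votes, repeat=n_manipulators):
            if rule(profile_nm + list(manip_profile)) == c:
                return list(manip_profile)
        return None

    def plurality(profile):
        # ties are broken in favor of the alphabetically first alternative
        scores = {}
        for ranking in profile:
            scores[ranking[0]] = scores.get(ranking[0], 0) + 1
        return max(sorted(scores), key=lambda a: scores[a])

    P_nm = [("a", "b", "c"), ("a", "b", "c"), ("b", "a", "c")]
    print(ucm(plurality, P_nm, 2, "b", ["a", "b", "c"]))  # two manipulators can make b win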

The stunningly big table for UCM
#manipulators           One manipulator          At least two
Copeland                P [BTT SCW-89b]          NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]          NPC [BO SCW-91]
Veto                    P [ZPR AIJ-09]           P [ZPR AIJ-09]
Plurality with runoff   P [ZPR AIJ-09]           P [ZPR AIJ-09]
Cup                     P [CSL JACM-07]          P [CSL JACM-07]
Borda                   P [BTT SCW-89b]          NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                 P [BTT SCW-89b]          NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]      NPC [XZP+ IJCAI-09]
Bucklin                 P [XZP+ IJCAI-09]        P [XZP+ IJCAI-09]
Nanson's rule           NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
Baldwin's rule          NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule (this includes many common voting rules): as the number of manipulators grows, the coalition goes from having no power to being all-powerful, with the transition at Θ(√n) manipulators
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
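A Python sketch in the spirit of such a greedy approach (an illustration, not a verbatim transcription of [ZPR AIJ-09]): each manipulator ranks c first and then lists the remaining alternatives from currently weakest to currently strongest, so the strongest rival receives the fewest points; manipulators are added one by one until c wins (ties count as a win here).

    # Greedy heuristic for UCO under Borda, in the spirit of [ZPR AIJ-09] (a sketch).
    def borda_scores(profile, alternatives):
        m = len(alternatives)
        scores = {a: 0 for a in alternatives}
        for ranking in profile:
            for pos, a in enumerate(ranking):
                scores[a] += m - 1 - pos
        return scores

    def greedy_uco_borda(profile_nm, c, alternatives):
        profile = list(profile_nm)
        for k in range(1, 10 * len(profile_nm) + 10):        # arbitrary safety cap
            scores = borda_scores(profile, alternatives)
            others = [a for a in alternatives if a != c]
            # weakest rival placed right below c, strongest rival at the bottom
            profile.append(tuple([c] + sorted(others, key=lambda a: scores[a])))
            new_scores = borda_scores(profile, alternatives)
            if all(new_scores[c] >= new_scores[a] for a in others):
                return k                                      # number of manipulators used
        return None

    P_nm = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("b", "a", "d", "c")]
    print(greedy_uco_borda(P_nm, "d", ["a", "b", "c", "d"]))  # 3 manipulators suffice here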

An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• Preemption: jobs are allowed to be interrupted (and may resume later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
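For intuition, the optimal preemptive makespan on uniform machines has a simple closed form, usually attributed to the classical level-algorithm analyses: the maximum of (total work)/(total speed) and, for each k, (the k largest jobs)/(the k fastest machines). The sketch below only computes this bound; it is not an implementation of the [Gonzalez&Sahni JACM 78] algorithm itself.

    # Closed-form optimal makespan for Q|pmtn|Cmax (sketch, stated without proof).
    def preemptive_makespan(jobs, speeds):
        jobs = sorted(jobs, reverse=True)                      # largest work first
        speeds = sorted(speeds, reverse=True)[:len(jobs)]      # only the fastest machines matter
        bounds = [sum(jobs) / sum(speeds)]                     # all useful machines busy all the time
        for k in range(1, len(speeds)):
            # the k largest jobs can use at most the k fastest machines at any moment
            bounds.append(sum(jobs[:k]) / sum(speeds[:k]))
        return max(bounds)

    print(preemptive_makespan(jobs=[7.0, 5.0, 3.0], speeds=[3.0, 2.0, 1.0]))  # 2.5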

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the non-manipulators' profile PNM
• Each manipulator ranks c first; alternative ci corresponds to job Ji with workload pi - p, and the positions below c correspond to machines with speeds s1-s2, s1-s3, …, s1-sm
• Example: after adding one manipulator vote V1 = [c > c1 > c2 > c3] to PNM, the remaining workloads are
  – c1 (J1): p1 - p - (s1-s2), processed at speed s1-s2
  – c2 (J2): p2 - p - (s1-s3), processed at speed s1-s3
  – c3 (J3): p3 - p - (s1-s4), processed at speed s1-s4
55

The approximation algorithm
Original UCO → reduce to a scheduling problem → solve it optimally [Gonzalez&Sahni JACM 78] → solution to the scheduling problem → rounding → solution to the UCO, with cost no more than OPT+m-2
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information about the non-manipulators' votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote W such that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[Figure: three pictures A, B, C; each Turker submits a (possibly partial) ranking over them]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a ground truth, and a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth
• Then, as n→∞, the majority of agents' preferences converges in probability to the ground truth
75
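A quick Monte Carlo sketch of the statement; p, the jury sizes, and the trial count below are arbitrary illustrative values.

    # Monte Carlo illustration of the Condorcet Jury theorem (sketch).
    import random

    def majority_correct_rate(n_agents, p, trials=2000):
        correct = 0
        for _ in range(trials):
            votes_for_truth = sum(random.random() < p for _ in range(n_agents))
            if votes_for_truth > n_agents / 2:
                correct += 1
        return correct / trials

    for n in (1, 11, 101, 1001):
        print(n, majority_correct_rate(n, p=0.6))   # approaches 1 as n grows, since p > 0.5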

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem can be extended to m > 2
• Young gave an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given "ground truth" opinions W and p < 1, generate opinions V such that each pairwise opinion is i.i.d.:
– if c ≻ d in W, then c ≻ d in V with probability p, and d ≻ c in V with probability 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a "ground truth" ranking W and ϕ < 1, generate a ranking V with probability
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
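A brute-force Python sketch that makes the equivalence concrete: maximizing the Mallows likelihood ∏ ϕ^Kendall(V,W) over W is the same as minimizing Σ Kendall(V,W), i.e., computing a Kemeny ranking (feasible only for small m).

    # Brute-force Kemeny / Mallows-MLE ranking for small m (a sketch).
    from itertools import combinations, permutations

    def kendall_tau(r1, r2):
        pos1 = {a: i for i, a in enumerate(r1)}
        pos2 = {a: i for i, a in enumerate(r2)}
        return sum(1 for a, b in combinations(r1, 2)
                   if (pos1[a] < pos1[b]) != (pos2[a] < pos2[b]))

    def kemeny(profile, alternatives):
        # argmin over all rankings of the total Kendall tau distance to the profile,
        # which is exactly the MLE of the ground-truth ranking under the Mallows model
        return min(permutations(alternatives),
                   key=lambda w: sum(kendall_tau(v, w) for v in profile))

    profile = [("a", "b", "c"), ("a", "b", "c"), ("b", "c", "a")]
    print(kemeny(profile, ["a", "b", "c"]))   # ('a', 'b', 'c')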

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a parametric model Mr with ground truth Θ:
• Step 1 (statistical inference): from the data D = (P1, …, Pn), extract information about the ground truth
• Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): compute the most probable ranking from the data D = (P1, …, Pn)
• Step 2: output its top-1 alternative (the winner)
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails (credit: Panos Ipeirotis & Roy Radner)
– Do you think the next two tosses will be two heads in a row?
• Likelihood reasoning: there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!
• Bayesian: the ground truth is captured by a belief distribution
– Compute Pr(p | Data) assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
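Both numbers in this example can be reproduced directly; a sketch, where the Bayesian side uses the posterior predictive under the uniform (Beta(1,1)) prior.

    # Reproducing the coin example: MLE plug-in vs. Bayesian posterior predictive (sketch).
    heads, tails = 10, 4

    # Likelihood reasoning: plug in the MLE p_hat and predict two heads.
    p_hat = heads / (heads + tails)                  # 0.714...
    print(p_hat ** 2)                                # ~0.51 > 0.5 -> "Yes"

    # Bayesian reasoning with a uniform prior: posterior is Beta(heads+1, tails+1);
    # Pr(next two tosses are heads | data) = E[p^2] under the posterior.
    a, b = heads + 1, tails + 1
    pred_two_heads = (a / (a + b)) * ((a + 1) / (a + b + 1))
    print(pred_two_heads)                            # ~0.485 < 0.5 -> "No"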

Kemeny = Likelihood approach
Mr = Mallows model
• Step 1 (MLE): compute the most probable ranking from the data D = (P1, …, Pn)
• Step 2: output its top-1 alternative
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet's model
• Step 1: compute the likelihood of all parameters (opinions) and find the most probable ranking
• Step 2: choose the top alternative of the most probable ranking
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet's model
• Step 1 (Bayesian update): compute the posterior over rankings from the data D = (P1, …, Pn)
• Step 2: output the most likely top-1 alternative
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical decision framework
(Data D → inference → information about the ground truth → decision making → decision)
• Model selection: how can we evaluate fitness?
• Likelihood or Bayesian? Focus on MLE
• Computation: how can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2 ≻ c1 ≻ c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
[Figure: utility distributions μ1, μ2, μ3 with parameters θ1, θ2, θ3 and sampled utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏_{R ∈ Data} Pr(R | θ1, θ2, θ3)
[Figure: the parameters θ1, θ2, θ3 independently generate each agent's ranking, e.g. Agent 1: P1 = c2 ≻ c1 ≻ c3, …, Agent n: Pn = c1 ≻ c2 ≻ c3]
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1, …, λm such that
  Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = [λ1 / (λ1 + … + λm)] × [λ2 / (λ2 + … + λm)] × … × [λm-1 / (λm-1 + λm)]
  – the i-th factor is the probability that ci is the top choice in {ci, …, cm}; the last factor is the probability that cm-1 is preferred to cm
• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
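A sketch of the product formula above; the λ values used here are arbitrary illustrative weights.

    # Plackett-Luce probability of a ranking, using the product formula above (sketch).
    from itertools import permutations

    def pl_prob(ranking, lam):
        """ranking: tuple of alternatives, best first; lam: dict alternative -> positive weight."""
        prob = 1.0
        for i in range(len(ranking) - 1):
            remaining = ranking[i:]
            prob *= lam[ranking[i]] / sum(lam[a] for a in remaining)
        return prob

    lam = {"a": 3.0, "b": 2.0, "c": 1.0}                     # illustrative weights
    print(pl_prob(("a", "b", "c"), lam))                      # (3/6) * (2/3) = 1/3
    print(sum(pl_prob(r, lam) for r in permutations(lam)))    # sanity check: ~1.0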

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{+∞} μm(Um) ∫_{Um}^{+∞} μm-1(Um-1) … ∫_{U2}^{+∞} μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to +∞; Um-1 from Um to +∞; …; U1 from U2 to +∞

98
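Since no closed form is known, the nested integral is typically approximated; a simple Monte Carlo sketch, where the means, the fixed variance, and the sample count are illustrative.

    # Monte Carlo estimate of Pr(c1 > c2 > c3 | theta) under a normal RUM (sketch).
    import random

    def rum_ranking_prob(means, ranking, sigma=1.0, samples=200_000):
        """means: dict alternative -> mean utility; ranking: tuple, best first."""
        hits = 0
        for _ in range(samples):
            u = {a: random.gauss(means[a], sigma) for a in means}
            if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
                hits += 1
        return hits / samples

    theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}    # illustrative means
    print(rum_ranking_prob(theta, ("c1", "c2", "c3")))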

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
  ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                             LL            Pred. LL       AIC             BIC
  Value(Normal) - Value(PL)  44.8 (15.8)   87.4 (30.5)    -79.6 (31.6)    -50.5 (31.6)

Red: statistically significant with 95% confidence
102
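For reference, the two information criteria used here can be computed from a fitted model's maximized log-likelihood; a sketch, where k is the number of free parameters, n the number of observations, and the example numbers are illustrative only.

    # AIC and BIC from a fitted model's maximized log-likelihood (sketch).
    from math import log

    def aic(log_likelihood, k):
        return 2 * k - 2 * log_likelihood

    def bic(log_likelihood, k, n):
        return k * log(n) - 2 * log_likelihood

    # illustrative numbers only; lower values indicate a better model under each criterion
    print(aic(log_likelihood=-120.0, k=9), bic(log_likelihood=-120.0, k=9, n=50))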

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 24

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 25

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes.
• Can we use computational complexity as a barrier? Yes.
• Is it a strong barrier? No; how often does it help? Seems not very often.
• Other barriers? Limited information, limited communication.
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the role hardness plays in cryptography
NPHard

For which common voting rules is
manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
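An illustrative brute-force UCM checker for the plurality rule (the tie-breaking and the toy profile below are assumptions made for the sketch); it enumerates all joint manipulator votes, so it only works for tiny instances, which motivates the complexity results collected in the table on the next slide.

from itertools import permutations, product

def plurality_winner(profile, alternatives):
    """Plurality with lexicographic tie-breaking (an illustrative assumption)."""
    scores = {a: 0 for a in alternatives}
    for vote in profile:
        scores[vote[0]] += 1
    return min(alternatives, key=lambda a: (-scores[a], a))

def ucm_plurality(p_nm, n_manip, c, alternatives):
    """Is there a joint vote of n_manip manipulators making c the winner?"""
    votes = list(permutations(alternatives))
    for joint in product(votes, repeat=n_manip):
        if plurality_winner(list(p_nm) + list(joint), alternatives) == c:
            return True
    return False

alts = ["a", "b", "c"]
p_nm = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a")]   # toy non-manipulator profile
print(ucm_plurality(p_nm, 1, "b", alts), ucm_plurality(p_nm, 2, "b", alts))   # False True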

The stunningly big table for UCM
(columns: one manipulator / at least two manipulators)
• Copeland: P [BTT SCW-89b] / NPC [FHS AAMAS-08,10]
• STV: NPC [BO SCW-91] / NPC [BO SCW-91]
• Veto: P [ZPR AIJ-09] / P [ZPR AIJ-09]
• Plurality with runoff: P [ZPR AIJ-09] / P [ZPR AIJ-09]
• Cup: P [CSL JACM-07] / P [CSL JACM-07]
• Borda: P [BTT SCW-89b] / NPC [DKN+ AAAI-11, BNW IJCAI-11]
• Maximin: P [BTT SCW-89b] / NPC [XZP+ IJCAI-09]
• Ranked pairs: NPC [XZP+ IJCAI-09] / NPC [XZP+ IJCAI-09]
• Bucklin: P [XZP+ IJCAI-09] / P [XZP+ IJCAI-09]
• Nanson's rule: NPC [NWX AAAI-11] / NPC [NWX AAAI-11]
• Baldwin's rule: NPC [NWX AAAI-11] / NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– As the number of manipulators grows, the coalition's power jumps from essentially none (o(√n) manipulators) to all-powerful (ω(√n) manipulators); Θ(√n) is the threshold
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51
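A rough simulation of this phase transition for the plurality rule under the impartial culture assumption (the rule, tie-breaking, coalition sizes, and sample sizes are all choices made for the illustration): estimate how often a coalition of k = c·√n voters can make its favorite alternative win.

import math, random
from collections import Counter

def coalition_win_freq(n, k, m=3, trials=200, rng=random.Random(0)):
    """Estimated frequency with which k extra plurality votes for alternative 0
    make it the (lexicographic tie-breaking) winner, under impartial culture."""
    wins = 0
    for _ in range(trials):
        scores = Counter(rng.choices(range(m), k=n))   # sincere voters' top choices
        scores[0] += k                                  # the coalition all votes for alternative 0
        winner = min(range(m), key=lambda a: (-scores[a], a))
        wins += (winner == 0)
    return wins / trials

n = 10000
for c in (0.1, 1.0, 10.0):                              # coalition size k = c * sqrt(n)
    k = int(c * math.sqrt(n))
    print(f"k = {k:4d}: win frequency ~ {coalition_win_freq(n, k):.2f}")
# with k = 0 the frequency would just be the base rate of about 1/3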

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
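A sketch of the greedy idea for Borda (an illustrative reimplementation, not the authors' code): each manipulator ranks c first and gives the smallest remaining scores to the currently strongest opponents, and manipulators are added until c wins; the additive-error-1 guarantee quoted above is for the algorithm as defined in the cited paper.

def borda_scores(profile, alternatives):
    m = len(alternatives)
    scores = {a: 0 for a in alternatives}
    for vote in profile:
        for pos, a in enumerate(vote):
            scores[a] += m - 1 - pos
    return scores

def greedy_borda_uco(p_nm, c, alternatives):
    """Greedily add manipulator votes until c wins Borda (ties assumed broken for c).
    Returns the number of manipulators used."""
    profile = list(p_nm)
    manipulators = 0
    while True:
        scores = borda_scores(profile, alternatives)
        if all(scores[c] >= scores[a] for a in alternatives if a != c):
            return manipulators
        # strongest opponents get the smallest scores: rank them after c
        # in increasing order of their current totals
        others = sorted((a for a in alternatives if a != c), key=lambda a: scores[a])
        profile.append([c] + others)
        manipulators += 1

alts = ["a", "b", "c", "d"]
p_nm = [("a", "b", "c", "d")] * 4 + [("b", "a", "d", "c")] * 3   # toy profile
print(greedy_borda_uco(p_nm, "d", alts))   # 6 manipulators suffice for this toy profile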

An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problem Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resumed later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
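For Q|pmtn|Cmax the optimal makespan is given by the standard bound below: the total work divided by the total speed, or, for any k < m*, the k largest jobs divided by the k fastest machines, whichever is larger. A small sketch:

def opt_preemptive_makespan(jobs, speeds):
    """Optimal makespan for Q|pmtn|Cmax (jobs = processing requirements,
    speeds = machine speeds); the standard bound, attained by the level algorithm."""
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    bounds = [sum(jobs) / sum(speeds)]
    for k in range(1, min(len(speeds), len(jobs))):
        bounds.append(sum(jobs[:k]) / sum(speeds[:k]))
    return max(bounds)

print(opt_preemptive_makespan(jobs=[9, 7, 2], speeds=[3, 2, 1]))   # 3.2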

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
• Each manipulator ranks c first, so every manipulator vote adds s1 to c's total
• If ci is placed in the j-th position of a manipulator vote, the gap between ci and c shrinks by s1 - sj
– Example: after adding V1 = [c > c1 > c2 > c3], c has p + s1 and c1's remaining surplus is p1 - p - (s1 - s2)
• This is a preemptive scheduling instance: alternative ci corresponds to a job Ji with processing requirement pi - p, and the j-th position (j ≥ 2) of a manipulator vote corresponds to a machine with speed s1 - sj
55

The approximation algorithm
Original UCO → scheduling problem → optimal preemptive schedule [Gonzalez&Sahni JACM-78] → rounding → solution to the UCO that uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W such that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-87, Conitzer, Lang, &Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[figure: pictures A, B, C]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b} and a probability 0.5 < p < 1
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p it agrees with the ground truth
– w/p 1-p it differs from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
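A quick simulation of the theorem (p = 0.6 and the trial counts are arbitrary choices): the frequency with which the majority matches the ground truth approaches 1 as n grows.

import random

def majority_correct_freq(n, p, trials=2000, rng=random.Random(1)):
    """Fraction of trials in which a strict majority of n agents,
    each correct independently with probability p, recovers the ground truth."""
    hits = 0
    for _ in range(trials):
        correct_votes = sum(rng.random() < p for _ in range(n))
        hits += correct_votes > n / 2
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_freq(n, p=0.6))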

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
– if c≻d in W, then c≻d in V with probability p, and d≻c in V with probability 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
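An illustrative sketch (the ground truth, φ, and sample size are arbitrary choices): sample rankings from Mallows with the standard repeated-insertion procedure and check that the brute-force Kemeny ranking of the sample recovers the ground truth W.

import random
from itertools import permutations, combinations

def sample_mallows(w, phi, rng):
    """Repeated insertion: Pr(V|W) ∝ phi**Kendall(V, W)."""
    ranking = []
    for i, item in enumerate(w):
        # placing the new item above k already-inserted items creates k disagreements with W
        weights = [phi ** k for k in range(i + 1)]
        k = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(len(ranking) - k, item)
    return tuple(ranking)

def kendall(v, w):
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum((pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b])
               for a, b in combinations(w, 2))

def kemeny(profile, alternatives):
    return min(permutations(alternatives),
               key=lambda r: sum(kendall(v, r) for v in profile))

rng = random.Random(0)
W = ("a", "b", "c", "d")
profile = [sample_mallows(W, phi=0.5, rng=rng) for _ in range(200)]
print(kemeny(profile, W))   # with high probability equals W = ('a', 'b', 'c', 'd')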

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given the model Mr:
Data D = (P1, …, Pn) → Step 1: statistical inference → information about the ground truth Θ → Step 2: decision making → decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model
Data D = (P1, …, Pn) → Step 1: MLE → the most probable ranking → Step 2: top-1 alternative → a unique winner (the decision space)
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
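Both numbers above can be reproduced directly; with a uniform prior, the posterior after 10 heads and 4 tails is Beta(11, 5), and Pr(2 heads | Data) is its second moment.

# Likelihood (plug-in MLE) answer
p_mle = 10 / 14
print(p_mle ** 2)                      # ≈ 0.510 > 0.5

# Bayesian answer with a uniform prior: posterior is Beta(10+1, 4+1)
# Pr(2 heads | Data) = E[p^2] = (11/16) * (12/17) for a Beta(11, 5) posterior
print((11 / 16) * (12 / 17))           # ≈ 0.485 < 0.5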

Kemeny = Likelihood approach
Mr = Mallows model
Data D → Step 1: MLE → the most probable ranking → Step 2: top-1 alternative → winner
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet's model
Data D → Step 1: compute the likelihood of all parameters (opinions) → the most probable ranking → Step 2: choose the top alternative → winner
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet's model
Data D → Step 1: Bayesian update → posterior over rankings → Step 2: most likely top-1 → winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
(rows: Likelihood (Mallows) and Bayesian (Condorcet); columns: anonymity/neutrality/monotonicity, consistency, Condorcet consistency, easy to compute)
• Likelihood (Mallows): Y, N, Y, N
• Bayesian (Condorcet): Y, Y, N, N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(recap of the framework: Data D → inference → information about the ground truth → decision making → decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr(U2 > U1 > U3) where Ui ∼ μi
[figure: the three utility distributions μ1, μ2, μ3 centered at θ1, θ2, θ3, and sampled utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏R∈Data Pr(R | θ1, θ2, θ3)
[figure: parameters (θ1, θ2, θ3) generate Agent 1's vote P1 = c2≻c1≻c3, …, Agent n's vote Pn = c1≻c2≻c3]
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor is the probability that c1 is the top choice in {c1,…,cm}, the second that c2 is the top choice in {c2,…,cm}, and the last that cm-1 is preferred to cm
• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
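A direct transcription of the product formula above, with toy λ values; the j-th factor is the probability that the j-th-ranked alternative is the top choice among the alternatives not yet ranked.

from itertools import permutations

def plackett_luce_prob(ranking, lam):
    """Pr(ranking | λ) under Plackett-Luce; lam maps alternative -> λ > 0."""
    prob = 1.0
    remaining = list(ranking)
    while len(remaining) > 1:
        top = remaining.pop(0)
        prob *= lam[top] / (lam[top] + sum(lam[a] for a in remaining))
    return prob

lam = {"a": 3.0, "b": 2.0, "c": 1.0}             # toy parameters
print(plackett_luce_prob(("a", "b", "c"), lam))  # (3/6) * (2/3) = 1/3
print(sum(plackett_luce_prob(r, lam) for r in permutations(lam)))   # sanity check: 1.0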

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} ∫_{Um}^{∞} … ∫_{U2}^{∞} μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
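Since no closed form is known, a simple (if crude) way to approximate the integral above is plain Monte Carlo: sample the perceived utilities and count how often they come out in the stated order (the means and variances below are arbitrary).

import random

def mc_ranking_prob(means, sds, samples=200_000, rng=random.Random(0)):
    """Monte Carlo estimate of Pr(U1 > U2 > ... > Um) when Ui ~ Normal(means[i], sds[i])."""
    hits = 0
    for _ in range(samples):
        u = [rng.gauss(mu, sd) for mu, sd in zip(means, sds)]
        hits += all(u[i] > u[i + 1] for i in range(len(u) - 1))
    return hits / samples

# Pr(c1 ≻ c2 ≻ c3) when θ = (1.0, 0.5, 0.0) with unit variances
print(mc_ranking_prob(means=[1.0, 0.5, 0.0], sds=[1.0, 1.0, 1.0]))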

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99
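One practical consequence of log-concavity, sketched here for P-L with θ = log λ and assuming numpy/scipy are available: any local maximum found by a generic optimizer is a global MLE (θ of the first alternative is pinned to 0 to remove the shift invariance). This is a minimal illustration, not the MC-EM algorithm of the next slide.

import numpy as np
from scipy.optimize import minimize

def neg_log_lik(theta_free, profile, alternatives):
    """Negative P-L log-likelihood; theta of alternatives[0] is fixed to 0."""
    theta = dict(zip(alternatives, np.concatenate(([0.0], theta_free))))
    ll = 0.0
    for ranking in profile:
        for j in range(len(ranking) - 1):
            rest = ranking[j:]
            ll += theta[ranking[j]] - np.logaddexp.reduce([theta[a] for a in rest])
    return -ll

alternatives = ["a", "b", "c"]
profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c"), ("a", "b", "c")]  # toy data
res = minimize(neg_log_lik, x0=np.zeros(len(alternatives) - 1),
               args=(profile, alternatives), method="BFGS")
theta_hat = np.concatenate(([0.0], res.x))
print(dict(zip(alternatives, np.exp(theta_hat))))   # estimated λ's (up to scaling)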

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.
• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log-likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
– approximately computed by Gibbs sampling
• M-step
– Choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL): LL 44.8 (15.8), Pred. LL 87.4 (30.5), AIC -79.6 (31.6), BIC -50.5 (31.6)

Red: statistically significant with 95% confidence
102
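For reference, the two information criteria in the table are simple functions of the maximized log-likelihood LL, the number of free parameters k, and the number of data points n; the numbers below are hypothetical and only show the direction of the penalty.

import math

def aic(log_lik, k):
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    return k * math.log(n) - 2 * log_lik

# hypothetical numbers just to show how the parameter penalty enters
print(aic(log_lik=-120.0, k=9), bic(log_lik=-120.0, k=9, n=50))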

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[diagram: CS ⇄ Social Choice, with computational thinking + optimization algorithms in one direction and strategic thinking + methods/principles of aggregation in the other]

Thank you!


Slide 26

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 27

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
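For intuition, a brute-force Python sketch of the UCM decision problem on tiny instances; it assumes Borda as the rule r and a simple lexicographic tie-breaking, which are illustrative choices rather than anything fixed by the tutorial:

from itertools import permutations, combinations_with_replacement

def borda_winner(profile, alternatives):
    # Positional scoring with vector (m-1, m-2, ..., 0);
    # ties are broken in favor of the lexicographically last name.
    m = len(alternatives)
    score = {a: 0 for a in alternatives}
    for ranking in profile:
        for pos, a in enumerate(ranking):
            score[a] += m - 1 - pos
    return max(alternatives, key=lambda a: (score[a], a))

def ucm_brute_force(rule, p_nm, n_manipulators, c, alternatives):
    # Try every multiset of manipulator votes (only feasible for tiny instances).
    votes = list(permutations(alternatives))
    for p_m in combinations_with_replacement(votes, n_manipulators):
        if rule(list(p_nm) + list(p_m), alternatives) == c:
            return list(p_m)       # a successful manipulation
    return None                    # no manipulation exists

alts = ["a", "b", "c"]
p_nm = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a")]
print(ucm_brute_force(borda_winner, p_nm, 2, "c", alts))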

The stunningly big table for UCM
Rule                     One manipulator         At least two manipulators
Copeland                 P [BTT SCW-89b]         NPC [FHS AAMAS-08,10]
STV                      NPC [BO SCW-91]         NPC [BO SCW-91]
Veto                     P [ZPR AIJ-09]          P [ZPR AIJ-09]
Plurality with runoff    P [ZPR AIJ-09]          P [ZPR AIJ-09]
Cup                      P [CSL JACM-07]         P [CSL JACM-07]
Borda                    P [BTT SCW-89b]         NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                  P [BTT SCW-89b]         NPC [XZP+ IJCAI-09]
Ranked pairs             NPC [XZP+ IJCAI-09]     NPC [XZP+ IJCAI-09]
Bucklin                  P [XZP+ IJCAI-09]       P [XZP+ IJCAI-09]
Nanson's rule            NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
Baldwin's rule           NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Ease of
approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– As the number of manipulators grows, they go from having no power to being all-powerful; the phase transition is at Θ(√n)
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resumed later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
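As a side note, the optimal preemptive makespan for Q|pmtn|Cmax has a classical closed form (the k largest jobs cannot finish faster than the k fastest machines allow, and the total work is bounded by the total speed); a short Python sketch of that bound, which is not taken from the slides:

def preemptive_makespan(jobs, speeds):
    # Optimal makespan for Q|pmtn|Cmax: the maximum of (sum of k largest jobs) /
    # (sum of k largest speeds) over k, and (total work) / (total speed).
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    bounds = [sum(jobs) / sum(speeds)]
    for k in range(1, min(len(jobs), len(speeds))):
        bounds.append(sum(jobs[:k]) / sum(speeds[:k]))
    return max(bounds)

print(preemptive_makespan(jobs=[7.0, 5.0, 2.0], speeds=[3.0, 2.0, 1.0]))   # 2.4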

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
– Think of each manipulator vote as one unit of time: the vote V1 = [c > c1 > c2 > c3] gives c1, c2, c3 scores s2, s3, s4 while c gains s1, so it closes the gaps to c at rates s1 - s2, s1 - s3, s1 - s4
– Alternative ci becomes job Ji with workload pi - p (the score deficit the manipulators must make up); position i+1 of a manipulator vote becomes a machine with speed s1 - si+1
– After casting V1, the remaining workloads are p1 - p - (s1 - s2), p2 - p - (s1 - s3), and p3 - p - (s1 - s4)
55
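A hedged Python sketch of the mapping just described; the score totals below are hypothetical, and the bookkeeping follows the interpretation above rather than the paper's exact notation:

def uco_to_scheduling(score_vector, p_c, p_others):
    # Each manipulator vote ranks c first; placing ci in position j >= 2
    # closes the gap to c by s1 - sj.  So alternative ci becomes a job with
    # workload pi - p, and the m-1 positions become machines with speeds
    # s1 - s2, ..., s1 - sm.  One vote = one unit of scheduling time.
    s1 = score_vector[0]
    speeds = [s1 - s for s in score_vector[1:]]
    jobs = [max(0, p_i - p_c) for p_i in p_others]
    return jobs, speeds

# Borda scores (3,2,1,0) for m = 4; hypothetical totals from the non-manipulators.
jobs, speeds = uco_to_scheduling([3, 2, 1, 0], p_c=10, p_others=[14, 12, 9])
print(jobs, speeds)   # [4, 2, 0] and [1, 2, 3]
# The optimal preemptive makespan of this instance lower-bounds the number of
# manipulators needed; rounding the schedule costs at most m-2 extra votes.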

The approximation algorithm
• Original UCO instance → (reduction) → scheduling problem
• Solve the scheduling problem [Gonzalez&Sahni JACM-78]
• Round the preemptive schedule back to a solution of the UCO instance
• The rounded solution uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators' votes
– The manipulators can perfectly coordinate their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W such that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer, Lang, &Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
• Pictures A, B, C (shown as images on the slide)
– Turker 1: A > B > C
– Turker 2: B > A
– …
– Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it agrees with the ground truth
– w/p 1-p, it disagrees with the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
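A small Monte Carlo sketch of the convergence (the values of p and n are arbitrary illustrations):

import random

def jury_accuracy(p, n, trials=2000, seed=0):
    # Probability that a simple majority of n i.i.d. agents, each agreeing with
    # the ground truth with probability p > 0.5, recovers the ground truth.
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        agreeing = sum(1 for _ in range(n) if rng.random() < p)
        correct += agreeing > n / 2
    return correct / trials

for n in (1, 11, 101, 1001):
    print(n, jury_accuracy(p=0.55, n=n))   # accuracy approaches 1 as n grows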

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p
77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
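Since the Mallows MLE ranking is the Kemeny ranking, it can be computed by brute force for small m; a minimal Python sketch (illustrative only, real instances need the practical algorithms cited earlier):

from itertools import permutations

def kendall_tau(v, w):
    # Number of pairwise comparisons on which rankings v and w disagree.
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for i in range(len(v)) for j in range(i + 1, len(v))
               if pos_w[v[i]] > pos_w[v[j]])

def mallows_mle(profile, alternatives):
    # For 0 < phi < 1, the argmax of phi**(total Kendall tau distance) is the
    # ranking minimizing the total distance, i.e. the Kemeny ranking.
    return min(permutations(alternatives),
               key=lambda w: sum(kendall_tau(v, w) for v in profile))

profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
print(mallows_mle(profile, ["a", "b", "c"]))   # ('a', 'b', 'c')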

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
• Step 1 (statistical inference): from the data D = (P1, …, Pn), which is generated from the ground truth Θ according to Mr, infer information about the ground truth
• Step 2 (decision making): given Mr, map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): from the data D, find the most probable ranking
• Step 2: output its top-1 alternative as the winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
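The two numbers can be checked directly; a short Python sketch assuming the uniform prior mentioned above (so the posterior is Beta(11, 5)):

heads, tails = 10, 4

# Likelihood (MLE) reasoning: plug in the point estimate p_hat.
p_hat = heads / (heads + tails)
print(round(p_hat, 3), round(p_hat ** 2, 3))    # 0.714 0.51

# Bayesian reasoning: with a uniform prior the posterior is Beta(heads+1, tails+1),
# and Pr(next two tosses are heads | data) = E[p^2] under that posterior.
a, b = heads + 1, tails + 1
e_p2 = (a * (a + 1)) / ((a + b) * (a + b + 1))
print(round(e_p2, 3))                            # 0.485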

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1 (MLE): from the data D, find the most probable ranking
• Step 2: output its top-1 alternative as the winner
• This is the Kemeny rule (for a single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: from the data D, compute the likelihood of every parameter (every combination of opinions)
• Step 2: output the top-1 alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1 (Bayesian update): from the data D, compute the posterior over rankings
• Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                        Anonymity, neutrality,   Consistency   Condorcet   Easy to
                        monotonicity                                       compute
Likelihood (Mallows)    Y                        Y             N           N
Bayesian (Condorcet)    Y                        N             Y           N
Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(Data D → inference → information about the ground truth → decision making → decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
(figure: utility distributions with means θ1, θ2, θ3 and sampled utilities U1, U2, U3)
95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
• Parameters: θ1, θ2, θ3
• Agent 1 draws P1 = c2≻c1≻c3, …, Agent n draws Pn = c1≻c2≻c3, each independently
96
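A minimal Python sketch (with made-up means θ and a fixed variance) of how a profile is generated from a RUM with normal utility distributions:

import random

def sample_rum_profile(means, n, sigma=1.0, seed=0):
    # Each agent draws a perceived utility Ui ~ Normal(theta_i, sigma^2) per
    # alternative, independently, and reports the ranking by decreasing utility.
    rng = random.Random(seed)
    profile = []
    for _ in range(n):
        utilities = {a: rng.gauss(theta, sigma) for a, theta in means.items()}
        profile.append(tuple(sorted(utilities, key=utilities.get, reverse=True)))
    return profile

profile = sample_rum_profile({"c1": 1.0, "c2": 0.5, "c3": 0.0}, n=5)
for ranking in profile:
    print(" > ".join(ranking))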

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+λ2+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor is the probability that c1 is the top choice in {c1,…,cm}; the last is the probability that cm-1 is preferred to cm in {cm-1,cm}

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
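A short Python sketch of the sequential-choice formula above, with hypothetical λ values; it also checks that the probabilities of all rankings sum to 1:

from itertools import permutations

def plackett_luce_prob(ranking, lam):
    # Pick the next alternative with probability proportional to its lambda
    # among the alternatives not yet ranked.
    prob, remaining = 1.0, list(ranking)
    while len(remaining) > 1:
        top = remaining.pop(0)
        prob *= lam[top] / sum(lam[a] for a in [top] + remaining)
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}   # hypothetical parameters
print(plackett_luce_prob(("c1", "c2", "c3"), lam))                 # (3/6)*(2/3) = 1/3
print(sum(plackett_luce_prob(r, lam) for r in permutations(lam)))  # 1.0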

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, and U1 from U2 to ∞
98
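Since no closed form is known, the probability can be approximated by sampling; a minimal Monte Carlo sketch with hypothetical parameters:

import random

def mc_ranking_prob(ranking, means, sigma=1.0, samples=20000, seed=0):
    # Estimate Pr(ranking | Theta) for a RUM with normal utilities by sampling
    # the utilities and counting how often they induce exactly this ranking.
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = {a: rng.gauss(means[a], sigma) for a in means}
        hits += list(ranking) == sorted(means, key=u.get, reverse=True)
    return hits / samples

means = {"c1": 1.0, "c2": 0.5, "c3": 0.0}   # hypothetical Thurstonian means
print(mc_ranking_prob(("c1", "c2", "c3"), means))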

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ:
– compute the expected log likelihood ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), where g(Data, Θt) is approximated by Gibbs sampling
• M-step:
– choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters
                            LL            Pred. LL       AIC            BIC
Value(Normal) - Value(PL)   44.8 (15.8)   87.4 (30.5)    -79.6 (31.6)   -50.5 (31.6)
Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 28

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 29

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Ease of approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

[Figure: as the number of manipulators grows, the coalition goes from “no power” to “all-powerful”; the phase transition is at Θ(√n) manipulators.]
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
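The following is a sketch in the spirit of such a greedy heuristic for Borda (my own paraphrase, not the published algorithm): manipulators are added one at a time, each ranks c first, and the largest remaining Borda score goes to whichever other alternative currently has the lowest total; ties are assumed to be broken in favor of c.

```python
# Greedy UCO-style heuristic for Borda (a sketch, not the published algorithm).
def greedy_uco_borda(scores_nm, c):
    """scores_nm: dict alternative -> Borda score from the non-manipulators."""
    scores = dict(scores_nm)
    m = len(scores)
    k = 0                                    # number of manipulators added so far
    while any(scores[x] > scores[c] for x in scores if x != c):  # ties favor c
        k += 1
        scores[c] += m - 1                   # each manipulator ranks c first
        others = [x for x in scores if x != c]
        for points in range(m - 2, -1, -1):  # hand out m-2, m-3, ..., 0
            x = min(others, key=lambda a: scores[a])   # currently weakest alternative
            scores[x] += points
            others.remove(x)
    return k

print(greedy_uco_borda({"c": 0, "a": 4, "b": 3, "d": 1}, "c"))   # 2 manipulators
```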

An approximation algorithm for
positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• Preemption: jobs are allowed to be interrupted
(and resumed later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
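For Q|pmtn|Cmax the optimal makespan has a classical closed form (the maximum of the total-work bound and the k-prefix bounds), which the Gonzalez-Sahni algorithm attains; a small sketch with made-up numbers:

```python
# Optimal makespan for Q|pmtn|Cmax via the classical lower bound, which is tight.
def optimal_makespan(jobs, speeds):
    """jobs: processing requirements; speeds: machine speeds (all positive)."""
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    best = sum(jobs) / sum(speeds)                 # all machines working all the time
    for k in range(1, len(speeds)):                # k largest jobs vs. k fastest machines
        best = max(best, sum(jobs[:k]) / sum(speeds[:k]))
    return best

print(optimal_makespan(jobs=[9, 5, 2], speeds=[3, 2, 1]))   # 3.0: the job of size 9 is binding
```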

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1
obtain in the non-manipulators’ profile
[Figure: the scores in PNM ∪ {V1 = [c > c1 > c2 > c3]}; alternative ci corresponds to a job Ji with remaining work pi − p, and the positions below c in a manipulator’s vote correspond to machines with speeds s1 − s2, s1 − s3, …, s1 − sm.]
55

The approximation algorithm
Original UCO → reduce to the scheduling problem → solve it optimally [Gonzalez&Sahni JACM 78] → round the schedule → solution to the UCO using no more than OPT + m − 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly coordinate their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W such that
• no matter how many followers cast W, the
leader and potential followers are never worse off

• sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer, Lang, &Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[Figure: three pictures A, B, C]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p with 0.5 < p < 1
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
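A small simulation (with made-up values of p and n) illustrates the statement: the probability that the majority matches the ground truth grows toward 1 as n increases.

```python
# Monte Carlo check of the Condorcet Jury theorem for two alternatives.
import random

def majority_correct(n, p, trials=2000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        correct_votes = sum(rng.random() < p for _ in range(n))
        hits += correct_votes > n / 2
    return hits / trials

for n in (1, 11, 101, 501):
    print(n, round(majority_correct(n, p=0.6), 3))   # approaches 1 as n grows
```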

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.: if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
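Since the MLE of Mallows’ model does not depend on ϕ, it can be computed (for tiny m) by brute-force minimization of the total Kendall tau distance, which is exactly the Kemeny rule; a sketch with made-up votes:

```python
# Kendall tau distance and the Mallows MLE ranking (= Kemeny) by enumeration.
from itertools import combinations, permutations

def kendall(v, w):
    """Number of pairs of alternatives on which rankings v and w disagree."""
    pos_v = {x: i for i, x in enumerate(v)}
    pos_w = {x: i for i, x in enumerate(w)}
    return sum((pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b])
               for a, b in combinations(v, 2))

def mallows_mle(profile, alternatives):
    # Pr(V|W) ∝ ϕ^Kendall(V,W) with 0<ϕ<1, so maximizing the likelihood of W
    # means minimizing the total Kendall tau distance to the profile.
    return min(permutations(alternatives),
               key=lambda w: sum(kendall(v, w) for v in profile))

profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
print(mallows_mle(profile, ["a", "b", "c"]))   # ('a', 'b', 'c')
```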

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given the model Mr and data D = (P1, P2, …, Pn) generated from a ground truth Θ:
Step 1: statistical inference – extract information about the ground truth from the data
Step 2: decision making – map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; decision space: a unique winner
Step 1: MLE – compute the most probable ranking from the data D = (P1, P2, …, Pn)
Step 2: output the top-1 alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
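The two answers in the coin example can be reproduced directly; the Bayesian computation uses the standard fact that a uniform prior plus 10 heads and 4 tails gives a Beta(11, 5) posterior.

```python
# Likelihood (plug-in) vs. Bayesian answer for "two heads in a row next".
from fractions import Fraction

heads, tails = 10, 4

# Likelihood reasoning: estimate p, then plug the point estimate in.
p_hat = heads / (heads + tails)
print(round(p_hat ** 2, 3))                    # 0.51 > 0.5 -> "yes"

# Bayesian reasoning: posterior Beta(a, b) with a=heads+1, b=tails+1, and
# Pr(2 heads | data) = E[p^2] = a/(a+b) * (a+1)/(a+b+1).
a, b = heads + 1, tails + 1
pr_two_heads = Fraction(a, a + b) * Fraction(a + 1, a + b + 1)
print(round(float(pr_two_heads), 3))           # 0.485 < 0.5 -> "no"
```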

Kemeny = Likelihood approach
Mr = Mallows model
Step 1: MLE – compute the most probable ranking from the data D = (P1, P2, …, Pn)
Step 2: output the top-1 alternative
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
Step 1: compute the likelihood of all parameters (opinions) given the data D = (P1, P2, …, Pn)
Step 2: choose the top alternative of the most probable ranking
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet’s model
Step 1: Bayesian update – compute the posterior over rankings from the data D = (P1, P2, …, Pn)
Step 2: output the most likely top-1 alternative
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                     | Anonymity, neutrality, monotonicity | Consistency | Condorcet | Easy to compute
Likelihood (Mallows)  | Y                                   | N           | Y         | N
Bayesian (Condorcet)  | Y                                   | N           | N         | Y

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[Diagram: Data D → inference → information about the ground truth → decision making → Decision]
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
[Figure: the utility distributions μ1, μ2, μ3 (parameterized by θ1, θ2, θ3) and sampled utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96
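A sketch of how such a profile is generated: each agent independently draws a utility for every alternative from its distribution and reports the induced ranking (normal utilities with unit variance and made-up means are assumed here):

```python
# Sample a preference profile from a RUM with normal utility distributions.
import random

def sample_profile(theta, n, sigma=1.0, seed=0):
    """theta: dict alternative -> mean utility; returns n sampled rankings."""
    rng = random.Random(seed)
    profile = []
    for _ in range(n):
        u = {c: rng.gauss(mu, sigma) for c, mu in theta.items()}
        profile.append(tuple(sorted(theta, key=lambda c: u[c], reverse=True)))
    return profile

print(sample_profile({"c1": 1.0, "c2": 0.5, "c3": 0.0}, n=5))
```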

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor: c1 is the top choice in {c1,…,cm}
– the last factor: cm-1 is preferred to cm in {cm-1, cm}

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
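The product form above gives a closed-form log-likelihood for any profile, which is what makes P-L computationally convenient; a sketch (the λ values and votes are made up):

```python
# Plackett-Luce log-likelihood: each ranking is a chain of "pick the next best".
from math import log

def pl_log_likelihood(profile, lam):
    """lam: dict alternative -> positive weight lambda_i."""
    ll = 0.0
    for ranking in profile:
        remaining = list(ranking)
        for c in ranking[:-1]:                 # the last position is forced
            ll += log(lam[c]) - log(sum(lam[x] for x in remaining))
            remaining.remove(c)
    return ll

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}
profile = [("c1", "c2", "c3"), ("c2", "c1", "c3")]
print(pl_log_likelihood(profile, lam))
```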

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
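In the absence of a closed form, a ranking probability under a normal RUM can still be estimated numerically, e.g. by plain Monte Carlo (the means, variance, and sample size below are made-up choices):

```python
# Monte Carlo estimate of Pr(c1 > c2 > c3 | Theta) for a normal RUM.
import random

def mc_ranking_prob(order, theta, sigma=1.0, samples=100000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = {c: rng.gauss(theta[c], sigma) for c in theta}
        hits += all(u[order[i]] > u[order[i + 1]] for i in range(len(order) - 1))
    return hits / samples

theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}
print(mc_ranking_prob(("c1", "c2", "c3"), theta))
```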

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Repeat until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters

Value(Normal) - Value(PL):
LL         | Pred. LL   | AIC         | BIC
44.8(15.8) | 87.4(30.5) | -79.6(31.6) | -50.5(31.6)

Red: statistically significant with 95% confidence
102
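For reference, the criteria in the table follow their textbook definitions, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters and n the number of votes; the numbers below are hypothetical and only illustrate the computation:

```python
# AIC and BIC from a fitted model's log-likelihood (lower is better).
from math import log

def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    return k * log(n) - 2 * log_likelihood

print(aic(log_likelihood=-120.0, k=9))          # hypothetical numbers
print(bic(log_likelihood=-120.0, k=9, n=50))
```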

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Diagram: CS ↔ Social Choice; computational thinking + optimization algorithms in one direction, strategic thinking + methods/principles of aggregation in the other]

Thank you!


Slide 30

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 31

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– including many common voting rules
– as the number of manipulators grows, the manipulators go from having
no power to being all-powerful, with the transition at Θ(√n) manipulators
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51
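
A quick way to see the Θ(√n) phenomenon experimentally (a hedged sketch, not from the slides): under the impartial culture assumption, only a coalition of roughly √n manipulators starts to matter. The sketch below uses plurality, for which a coalition of size k succeeds exactly when c's plurality score plus k strictly exceeds every other score; all numbers are toy values.

import random

def manipulable_rate(n, k, m=4, trials=2000):
    """Fraction of random plurality profiles where k manipulators can make
    alternative 0 the (strict) winner; impartial culture, so each
    non-manipulator's top choice is uniform over the m alternatives."""
    success = 0
    for _ in range(trials):
        counts = [0] * m
        for _ in range(n):
            counts[random.randrange(m)] += 1
        if counts[0] + k > max(counts[1:]):   # manipulators all rank 0 first
            success += 1
    return success / trials

for k in [0, 5, 30, 100]:   # compare with sqrt(1000), which is about 32
    print(k, manipulable_rate(n=1000, k=k))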

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and may resume later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
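
For reference (a sketch under the stated assumptions, not from the slides), the optimal makespan of Q|pmtn|Cmax has a closed form: with job lengths and machine speeds both sorted in decreasing order, it is the maximum of the k longest jobs over the k fastest speeds (for k < m*) and of the total work over the total speed.

def min_makespan(job_lengths, speeds):
    # classical formula for Q|pmtn|Cmax (jobs and speeds sorted decreasingly)
    p = sorted(job_lengths, reverse=True)
    s = sorted(speeds, reverse=True)
    bounds = [sum(p[:k]) / sum(s[:k]) for k in range(1, len(s))]
    bounds.append(sum(p) / sum(s))        # all of the work on all of the machines
    return max(bounds)

print(min_makespan([7, 5, 4, 2], [3, 2, 1]))   # 18/6 = 3.0 on these toy numbers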

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
• Candidate ci corresponds to job Ji of length pi − p; the (i+1)-th position
of a manipulator's vote corresponds to machine Mi of speed s1 − si+1
– for four alternatives: M1 has speed s1 − s2, M2 has speed s1 − s3,
M3 has speed s1 − s4
• Every manipulator ranks c first, so each vote runs the jobs for one unit of
time: after PNM ∪ {V1 = [c > c1 > c2 > c3]}, the remaining length of J1 is
p1 − p − (s1 − s2), of J2 is p2 − p − (s1 − s3), and of J3 is p3 − p − (s1 − s4)
55

The approximation algorithm
• Convert the original UCO instance into a scheduling problem
• Solve the scheduling problem optimally [Gonzalez & Sahni JACM-78]
• Round the preemptive schedule into a solution to the UCO
• The rounded solution uses no more than OPT + m − 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can fully coordinate their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W such that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(pictures A, B, and C are shown to the workers)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
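
A small simulation of the Jury theorem (illustration only; p = 0.6 and the trial counts below are assumed values): the majority agrees with the ground truth with probability approaching 1 as n grows.

import random

def majority_correct_rate(n, p, trials=2000):
    correct = 0
    for _ in range(trials):
        right_votes = sum(random.random() < p for _ in range(n))
        if right_votes > n / 2:   # majority agrees with the ground truth
            correct += 1
    return correct / trials

for n in [1, 11, 101, 1001]:
    print(n, majority_correct_rate(n, p=0.6))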

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is generated i.i.d.:
– if c≻d in W, then w/p p the opinion in V is c≻d, and w/p 1-p it is d≻c

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
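
Since the MLE of the Mallows model is the Kemeny ranking, for small m it can be computed by brute force over all rankings; a minimal sketch (the profile below is a toy example, not from the slides):

from itertools import permutations, combinations

def kendall_tau(V, W):
    # number of pairs of alternatives ordered differently by V and W
    pos_v = {a: i for i, a in enumerate(V)}
    pos_w = {a: i for i, a in enumerate(W)}
    return sum((pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b])
               for a, b in combinations(V, 2))

def kemeny_ranking(profile, alts):
    # the ranking minimizing total Kendall tau distance = MLE of Mallows
    return min(permutations(alts),
               key=lambda W: sum(kendall_tau(P, W) for P in profile))

profile = [('a', 'b', 'c'), ('a', 'b', 'c'), ('b', 'c', 'a')]   # toy votes
print(kemeny_ranking(profile, ['a', 'b', 'c']))                 # ('a', 'b', 'c')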

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
• Data D = (P1, P2, …, Pn), generated from the ground truth Θ under the model Mr
• Step 1: statistical inference (given Mr) → information about the ground truth
• Step 2: decision making → decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1: MLE → the most probable ranking
• Step 2: top-1 alternative → the winner
• Data D = (P1, P2, …, Pn)
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = 0.714^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data), assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
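
The two numbers above can be checked directly (a sketch; it uses scipy and the uniform prior stated on the slide, so the posterior after 10 heads and 4 tails is Beta(11, 5)):

from scipy.stats import beta

p_hat = 10 / 14
print(round(p_hat ** 2, 3))                     # likelihood reasoning: ~0.510

posterior = beta(11, 5)                         # uniform prior + 10 heads, 4 tails
e_p2 = posterior.var() + posterior.mean() ** 2  # Pr(2 heads | Data) = E[p^2]
print(round(e_p2, 3))                           # Bayesian answer: ~0.485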

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1: MLE → the most probable ranking
• Step 2: top-1 alternative → the winner
• This is the Kemeny rule (for single winner)!
• Data D = (P1, P2, …, Pn)
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood for all parameters (opinions)
• Step 2: choose the top alternative of the most probable ranking → the winner
• Data D = (P1, P2, …, Pn)
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet’s model
• Step 1: Bayesian update → posterior over rankings
• Step 2: most likely top-1 → the winner
• Data D = (P1, P2, …, Pn)
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                        Anonymity, neutrality,   Consistency   Condorcet   Easy to
                        monotonicity                                       compute
Likelihood (Mallows)    Y                        N             Y           N
Bayesian (Condorcet)    Y                        Y             N           N

• Decision space: single winners
• Assume a uniform prior in the Bayesian approach
• Principle: statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(recall the framework: Data D → inference → information about the ground
truth → decision making → Decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
(figure: the three utility distributions, parameterized by θ1, θ2, θ3, with
sampled utilities U1, U2, U3)

95
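
A minimal sketch of how a RUM generates a vote (normal utility distributions with unit variance and the parameter values below are assumed purely for illustration):

import random

def sample_ranking(theta):
    # theta: alternative -> mean utility; rank by independently sampled utilities
    utilities = {a: random.gauss(mu, 1.0) for a, mu in theta.items()}
    return tuple(sorted(utilities, key=utilities.get, reverse=True))

theta = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}      # assumed ground-truth parameters
print([sample_ranking(theta) for _ in range(5)])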

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
• Parameters: θ1, θ2, θ3 (the utility distributions in the figure)
• Agent 1: P1 = c2≻c1≻c3
• …
• Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm)
= [λ1 / (λ1 + λ2 + … + λm)] × [λ2 / (λ2 + … + λm)] × … × [λm-1 / (λm-1 + λm)]
– the i-th factor is the probability that ci is the top choice among {ci,…,cm}

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
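
The product formula above translates directly into code; a sketch with toy parameter values (not from the slides):

def pl_probability(ranking, lambdas):
    # probability of the ranking under Plackett-Luce with parameters lambdas
    prob = 1.0
    remaining = list(ranking)
    for c in ranking[:-1]:
        prob *= lambdas[c] / sum(lambdas[a] for a in remaining)
        remaining.remove(c)
    return prob

lambdas = {'c1': 3.0, 'c2': 2.0, 'c3': 1.0}            # toy values
print(pl_probability(('c1', 'c2', 'c3'), lambdas))     # (3/6) * (2/3) = 1/3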

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ)
= ∫_{Um=-∞}^{∞} ∫_{Um-1=Um}^{∞} … ∫_{U1=U2}^{∞} μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, and U1 from U2 to ∞

98
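
Because the integral above has no analytical solution, one crude but simple way to approximate Pr(c1 ≻ … ≻ cm | Θ) is Monte Carlo simulation; the means and unit variances below are assumed for illustration:

import random

def mc_ranking_prob(ranking, means, trials=50000):
    hits = 0
    for _ in range(trials):
        u = {a: random.gauss(means[a], 1.0) for a in means}   # sample utilities
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / trials

means = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}
print(mc_ranking_prob(('c1', 'c2', 'c3'), means))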

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t:
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)),
where g(Data, Θt) is approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                            LL            Pred. LL       AIC             BIC
Value(Normal) − Value(PL)   44.8 (15.8)   87.4 (30.5)    -79.6 (31.6)    -50.5 (31.6)

Red: statistically significant with 95% confidence
102
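
For reference, the two information criteria in the table are simple functions of the maximized log-likelihood (a sketch; ll, k, and n below are toy values, with k the number of free parameters and n the number of data points):

import math

def aic(ll, k):
    return 2 * k - 2 * ll

def bic(ll, k, n):
    return k * math.log(n) - 2 * ll

print(aic(ll=-120.0, k=9), bic(ll=-120.0, k=9, n=50))   # toy numbers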

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS ↔ Social Choice:
• Computational thinking + optimization algorithms
• Strategic thinking + methods/principles of aggregation

Thank you!


Slide 32

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 33

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• Preemption: jobs are allowed to be interrupted
(and may resume later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs (a closed-form computation is sketched after this slide)

54
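For reference, the optimal makespan of Q|pmtn|Cmax has a well-known closed form (the optimum computed by level-type algorithms such as the one cited two slides later): with jobs and machine speeds sorted in decreasing order, the k largest jobs cannot finish faster than the k fastest machines allow, all the work cannot finish faster than all machines allow, and one of these bounds is attainable. A minimal sketch, assuming positive speeds and at least as many jobs as machines:

def optimal_makespan(work, speeds):
    # work: processing requirements of the n* jobs; speeds: the m* machine speeds.
    p = sorted(work, reverse=True)
    s = sorted(speeds, reverse=True)
    m = len(s)
    bounds = []
    # The k largest jobs need at least sum(p[:k]) / sum(s[:k]) time on the k fastest machines.
    for k in range(1, m):
        bounds.append(sum(p[:k]) / sum(s[:k]))
    # All the work shared among all machines.
    bounds.append(sum(p) / sum(s))
    return max(bounds)

# Example: 4 jobs, 3 machines.
print(optimal_makespan(work=[9, 6, 4, 2], speeds=[3, 2, 1]))  # 3.5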

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1
obtain in the non-manipulators’ profile
• Every manipulator ranks c first, so each manipulator vote gives c exactly s1 points
• Alternative ci corresponds to job Ji with remaining workload pi - p; placing ci in position j of a manipulator’s vote reduces this gap by s1 - sj
• Positions 2, …, m of a manipulator’s vote correspond to machines with speeds s1 - s2, …, s1 - sm
• Example (m = 4): after adding the vote V1 = [c > c1 > c2 > c3] to PNM, the remaining workloads become p1 - p - (s1 - s2), p2 - p - (s1 - s3), and p3 - p - (s1 - s4)
55

The approximation algorithm
Original UCO → (reduction) → scheduling problem
Scheduling problem → (optimal algorithm [Gonzalez&Sahni JACM 78]) → solution to the scheduling problem
Solution to the scheduling problem → (rounding) → solution to the UCO, using no more than OPT+m-2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly coordinate their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W such that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
1. Classical Social Choice
2.1 Computational aspects, Part 1
2.2 Computational aspects, Part 2
3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(three pictures A, B, C to be ranked)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– a probability p > 0.5
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth (a quick simulation sketch follows this slide)
75
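A quick Monte Carlo sanity check of the statement; the value p = 0.55, the trial count, and the choices of n are arbitrary illustrative settings.

import random

def majority_correct_rate(n_agents, p, trials=2000, seed=0):
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        # Number of agents whose preference matches the ground truth.
        agree = sum(rng.random() < p for _ in range(n_agents))
        correct += agree > n_agents / 2
    return correct / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_rate(n, p=0.55))  # rate climbs toward 1 as n grows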

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem
can be extended to m>2
• Young offered an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.: if c≻d in W, then c≻d in V w/p p and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule (a small numerical check follows this slide)
78
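The numerical check mentioned above: the Mallows normalization constant does not depend on the ground-truth ranking W, so the log-likelihood of W is, up to a constant, log(ϕ) times the total Kendall tau distance from W to the profile. Maximizing the likelihood is therefore exactly minimizing the total Kendall tau distance, i.e., Kemeny. A small brute-force sketch, only for small m; the profile and ϕ below are illustrative.

from itertools import combinations, permutations
import math

def kendall_tau(v, w):
    # Number of pairwise comparisons on which rankings v and w disagree.
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2) if pos_w[a] > pos_w[b])

def mallows_mle(profile, alternatives, phi=0.5):
    # The normalization constant is the same for every W, so comparing
    # unnormalized log-likelihoods log(phi) * sum_V Kendall(V, W) is enough.
    def loglik(w):
        return math.log(phi) * sum(kendall_tau(v, w) for v in profile)
    return max(permutations(alternatives), key=loglik)

profile = [('a', 'b', 'c'), ('a', 'b', 'c'), ('b', 'c', 'a')]
print(mallows_mle(profile, ['a', 'b', 'c']))  # ('a', 'b', 'c'), the Kemeny ranking here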

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a model Mr with ground truth Θ that generates the data D = (P1, P2, …, Pn):
Step 1 (statistical inference): from the data D, extract information about the ground truth
Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model
Step 1 (MLE): the most probable ranking, given the data D = (P1, P2, …, Pn)
Step 2 (top-1 alternative): decision space is a unique winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p|Data) assuming a uniform prior
– Compute Pr(2 heads|Data) = 0.485 < 0.5
– No!
(both computations are spelled out after this slide)

83
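The two calculations on this slide, spelled out. With a uniform prior, the posterior after 10 heads and 4 tails is Beta(11, 5), and the probability of two more heads is E[p^2] under that posterior.

heads, tails = 10, 4

# Likelihood (plug-in MLE) reasoning
p_mle = heads / (heads + tails)
print("MLE p:", p_mle, "Pr(2 heads) =", p_mle ** 2)        # ≈ 0.714 and ≈ 0.510

# Bayesian reasoning with a uniform (Beta(1,1)) prior: posterior is Beta(11, 5)
a, b = heads + 1, tails + 1
pr_two_heads = (a / (a + b)) * ((a + 1) / (a + b + 1))     # E[p^2] under Beta(a, b)
print("Bayesian Pr(2 heads | Data):", pr_two_heads)        # ≈ 0.485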

Kemeny = Likelihood approach
Mr = Mallows model
Step 1 (MLE): the most probable ranking, given the data D = (P1, P2, …, Pn)
Step 2 (top-1 alternative): the winner
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
Step 1: compute the likelihood for all parameters (opinions), given the data D = (P1, P2, …, Pn)
Step 2: choose the top alternative of the most probable ranking; this is the winner
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet’s model
Step 1 (Bayesian update): posterior over rankings, given the data D = (P1, P2, …, Pn)
Step 2: most likely top-1 alternative; this is the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
(comparison table from the original slide: Likelihood (Mallows) vs. Bayesian (Condorcet), evaluated on anonymity/neutrality/monotonicity, consistency, the Condorcet criterion, and ease of computation)

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(Data D → inference → information about the ground truth → decision making → decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
(figure: the three utility distributions μ1, μ2, μ3, parameterized by θ1, θ2, θ3, with sampled utilities U1, U2, U3)

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
• The parameters (θ1, θ2, θ3) generate each agent’s ranking independently, e.g.
– Agent 1: P1 = c2≻c1≻c3
– …
– Agent n: Pn = c1≻c2≻c3
(a sampling sketch follows this slide)
96
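A sketch of this generative process with normal utility distributions of unit variance; the parameter values, the number of agents, and the unit variance are illustrative assumptions. It also shows how a ranking probability can be estimated by Monte Carlo when no closed form is available.

import random

def sample_ranking(means, rng):
    # Each alternative gets a perceived utility U_a ~ Normal(theta_a, 1);
    # the agent reports the alternatives sorted by utility.
    utilities = {a: rng.gauss(theta_a, 1.0) for a, theta_a in means.items()}
    return tuple(sorted(utilities, key=utilities.get, reverse=True))

def sample_profile(means, n, seed=0):
    rng = random.Random(seed)
    return [sample_ranking(means, rng) for _ in range(n)]

theta = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}   # ground-truth parameters (example values)
print(sample_profile(theta, n=5))

# Monte Carlo estimate of Pr(c2 > c1 > c3 | theta):
samples = sample_profile(theta, n=20000, seed=1)
print(sum(r == ('c2', 'c1', 'c3') for r in samples) / len(samples))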

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– c1 is the top choice in {c1,…,cm}, c2 is the top choice in {c2,…,cm}, …, cm-1 is preferred to cm

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function (a computation is sketched after this slide)
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
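The analytical P-L likelihood follows directly from the product formula above: each factor is the probability that the next-ranked alternative is chosen first among those still remaining. The λ values and the small profile below are illustrative.

import math

def pl_prob(ranking, lam):
    # Pr(ranking | lambda) under Plackett-Luce.
    prob = 1.0
    remaining = list(ranking)
    while len(remaining) > 1:
        top = remaining[0]
        prob *= lam[top] / sum(lam[a] for a in remaining)
        remaining.pop(0)
    return prob

def pl_log_likelihood(profile, lam):
    return sum(math.log(pl_prob(r, lam)) for r in profile)

lam = {'c1': 2.0, 'c2': 1.0, 'c3': 0.5}
profile = [('c1', 'c2', 'c3'), ('c2', 'c1', 'c3')]
print(pl_prob(('c1', 'c2', 'c3'), lam))    # = 2/3.5 * 1/1.5
print(pl_log_likelihood(profile, lam))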

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P | Θ) is known:
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} ∫_{Um}^{∞} … ∫_{U2}^{∞} μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)
ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt)), where g(Data, Θt) is approximately computed by Gibbs sampling
• M-step
– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with normal distributions and
P-L for
– log-likelihood (LL)
– predictive log-likelihood (Pred. LL)
– Akaike information criterion (AIC)
– Bayesian information criterion (BIC)
(AIC and BIC are computed as sketched after this slide)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters

Value(Normal) - Value(P-L):
  LL           Pred. LL      AIC           BIC
  44.8 (15.8)  87.4 (30.5)   -79.6 (31.6)  -50.5 (31.6)
Red: statistically significant with 95% confidence
102
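For reference, the standard definitions of the last two criteria, computed from a fitted model's maximized log-likelihood LL, its number of free parameters k, and the number of votes n; lower AIC/BIC indicates a better fit-versus-complexity tradeoff. The numbers below are placeholders, not values from this dataset.

import math

def aic(ll, k):
    # Akaike information criterion: 2k - 2*LL
    return 2 * k - 2 * ll

def bic(ll, k, n):
    # Bayesian information criterion: k*ln(n) - 2*LL
    return k * math.log(n) - 2 * ll

ll, k, n = -250.0, 9, 50   # hypothetical fit: 9 free parameters, 50 voters
print(aic(ll, k), bic(ll, k, n))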

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

3. Statistical approaches
• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS → Social Choice: computational thinking + optimization algorithms
Social Choice → CS: strategic thinking + methods/principles of aggregation

Thank you!


Slide 34

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 35

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
• Reduce the original UCO instance to a scheduling problem
• Solve the scheduling problem optimally [Gonzalez&Sahni JACM 78]
• Round the schedule back into a solution to the UCO
• The resulting solution uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer, Lang, &Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[Three pictures of dot patterns A, B, and C]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– a probability 0.5 < p < 1
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
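
A quick simulation (my own code, with p = 0.6 as an arbitrary example) illustrating the convergence: the probability that the majority matches the ground truth rises toward 1 as n grows.

```python
import random

def majority_correct_probability(n, p, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that the majority of n i.i.d. agents
    (each correct w/p p) matches the ground truth."""
    random.seed(seed)
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        hits += correct_votes > n / 2          # strict majority for the ground truth
    return hits / trials

for n in (1, 11, 101, 1001):                   # odd n avoids ties
    print(n, majority_correct_probability(n, p=0.6))
```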

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, each opinion in V is generated i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
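
A tiny brute-force sketch (my own code) making the last bullet concrete: under Mallows, log Pr(D|W) equals log(ϕ) times the total Kendall tau distance from W to the votes, up to a constant, so maximizing the likelihood is exactly minimizing the total Kendall tau distance, i.e., the Kemeny ranking. Only feasible for very small m.

```python
import math
from itertools import permutations, combinations

def kendall(V, W):
    """Number of pairs of alternatives on which rankings V and W disagree."""
    pos_v = {a: i for i, a in enumerate(V)}
    pos_w = {a: i for i, a in enumerate(W)}
    return sum(1 for a, b in combinations(V, 2)
               if (pos_v[a] - pos_v[b]) * (pos_w[a] - pos_w[b]) < 0)

def mallows_log_likelihood(W, profile, phi):
    # up to an additive constant: the normalization does not depend on W
    return math.log(phi) * sum(kendall(P, W) for P in profile)

def kemeny_ranking(profile, alts):
    return min(permutations(alts), key=lambda W: sum(kendall(P, W) for P in profile))

profile = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c')]
alts = ['a', 'b', 'c']
W_mle = max(permutations(alts), key=lambda W: mallows_log_likelihood(W, profile, phi=0.5))
print(kemeny_ranking(profile, alts), W_mle)   # the two coincide: ('a', 'b', 'c')
```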

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
[Diagram: under the model Mr, a ground truth Θ generates the votes P1, P2, …, Pn]
Data D = (P1, P2, …, Pn)
→ Step 1: statistical inference (given Mr) → information about the ground truth
→ Step 2: decision making → decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; decision space: a unique winner
Data D = (P1, P2, …, Pn)
→ Step 1: MLE → the most probable ranking
→ Step 2: top-1 alternative → winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– MLE: p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = 0.714^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!
83
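
The two numbers can be reproduced in a few lines (my own code): the likelihood approach plugs in the MLE p = 10/14, while the Bayesian approach uses the Beta(11,5) posterior induced by the uniform prior, for which E[p^2] = (11·12)/(16·17) ≈ 0.485.

```python
# Likelihood (plug-in MLE) vs. Bayesian predictive probability of two heads,
# after observing 10 heads and 4 tails with a uniform prior on p.
heads, tails = 10, 4

p_mle = heads / (heads + tails)
pr_two_heads_mle = p_mle ** 2                     # ≈ 0.510

# uniform prior => posterior is Beta(heads+1, tails+1) = Beta(11, 5)
a, b = heads + 1, tails + 1
pr_two_heads_bayes = (a * (a + 1)) / ((a + b) * (a + b + 1))   # E[p^2] ≈ 0.485

print(round(pr_two_heads_mle, 3), round(pr_two_heads_bayes, 3))
```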

Kemeny = Likelihood approach
Mr = Mallows model
Data D = (P1, P2, …, Pn)
→ Step 1: MLE → the most probable ranking
→ Step 2: top-1 alternative → winner
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet’s model
Data D = (P1, P2, …, Pn)
→ Step 1: compute the likelihood of all parameters (opinions)
→ Step 2: choose the top-ranked alternative of the most probable ranking → winner
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet’s model
Data D = (P1, P2, …, Pn)
→ Step 1: Bayesian update → posterior over rankings
→ Step 2: most likely top-1 alternative → winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]

                     | Anonymity, neutrality, monotonicity | Consistency | Condorcet | Easy to compute
Likelihood (Mallows) | Y                                   | N           | Y         | N
Bayesian (Condorcet) | Y                                   | Y           | N         | N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89
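
A small numerical check (my own construction, in the spirit of the noise models in [Conitzer&Sandholm UAI-05], with names and numbers of my choosing): if each vote is drawn with probability proportional to phi raised to the Borda score it gives to the ground-truth winner, the per-vote normalizer does not depend on the ground truth, so the MLE winner is exactly the Borda winner.

```python
import math

def borda_score(profile, a, m):
    return sum(m - 1 - V.index(a) for V in profile)

def mle_winner(profile, alts, phi=2.0):
    # noise model (assumed): Pr(vote V | true winner w) ∝ phi ** (Borda score of w in V);
    # log-likelihood of w, up to a constant that is the same for every w
    m = len(alts)
    return max(alts, key=lambda w: math.log(phi) * borda_score(profile, w, m))

def borda_winner(profile, alts):
    m = len(alts)
    return max(alts, key=lambda a: borda_score(profile, a, m))

profile = [('a', 'b', 'c'), ('b', 'a', 'c'), ('b', 'c', 'a'), ('a', 'b', 'c')]
alts = ['a', 'b', 'c']
print(mle_winner(profile, alts), borda_winner(profile, alts))   # the two coincide
```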

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical decision framework
[Diagram: Data D → inference → information about the ground truth → decision making → decision]
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
[Figure: utility distributions μ1, μ2, μ3 (parameterized by θ1, θ2, θ3) and sampled utilities U1, U2, U3]
95
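
A small sketch (my own code, using normal utility distributions with unit variance and arbitrary means) of the generative process: sample perceived utilities, rank by decreasing utility, and estimate Pr(c2 ≻ c1 ≻ c3) by simple Monte Carlo.

```python
import random

def sample_ranking(theta, sigma=1.0, rng=random):
    """One agent: draw a perceived utility Ui ~ Normal(theta_i, sigma) per alternative,
    then rank the alternatives by decreasing utility (indices returned, best first)."""
    utilities = [rng.gauss(t, sigma) for t in theta]
    return tuple(sorted(range(len(theta)), key=lambda i: -utilities[i]))

def estimate_ranking_probability(theta, target, trials=100000, seed=0):
    rng = random.Random(seed)
    return sum(sample_ranking(theta, rng=rng) == target for _ in range(trials)) / trials

theta = [0.0, 1.0, -0.5]          # means of mu_1, mu_2, mu_3 (toy numbers)
print(estimate_ranking_probability(theta, target=(1, 0, 2)))   # Pr(c2 ≻ c1 ≻ c3)
```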

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏R∈Data Pr(R | θ1, θ2, θ3)
[Figure: the parameters θ1, θ2, θ3 generate each agent’s ranking independently]
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = [λ1 / (λ1 + … + λm)] × [λ2 / (λ2 + … + λm)] × … × [λm-1 / (λm-1 + λm)]
– the first factor is the probability that c1 is the top choice in {c1,…,cm}, the second that c2 is the top choice in {c2,…,cm}, …, and the last that cm-1 is preferred to cm

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
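
A direct transcription (my own code) of the formula above: the Plackett–Luce probability of a full ranking, obtained by repeatedly picking the top remaining alternative with probability proportional to its weight. The weights in the example are arbitrary.

```python
import math

def plackett_luce_prob(ranking, lam):
    """Probability of a ranking (best first) under Plackett-Luce with positive weights lam."""
    prob = 1.0
    remaining = sum(lam[c] for c in ranking)
    for c in ranking[:-1]:
        prob *= lam[c] / remaining      # c is the top choice among the remaining alternatives
        remaining -= lam[c]
    return prob

def log_likelihood(profile, lam):
    return sum(math.log(plackett_luce_prob(R, lam)) for R in profile)

lam = {'a': 3.0, 'b': 2.0, 'c': 1.0}              # toy weights
print(plackett_luce_prob(('a', 'b', 'c'), lam))   # (3/6)·(2/3) = 1/3
```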

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} ∫_{Um}^{∞} … ∫_{U2}^{∞} μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm
– where Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, and U1 from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99
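
One practical consequence, sketched in my own code for the Plackett–Luce special case (function names, step size, and the toy profile are mine): since the log-likelihood is concave in the location parameters, plain gradient ascent from any starting point reaches a global maximum; the parameters are only identified up to a common shift, so the mean is pinned to zero.

```python
import math

def pl_mle_gradient_ascent(profile, alts, steps=2000, lr=0.05):
    """MLE for Plackett-Luce (a location-family RUM) by plain gradient ascent on theta = log(lambda);
    log-concavity of the likelihood means any optimum found this way is global."""
    theta = {a: 0.0 for a in alts}
    for _ in range(steps):
        grad = {a: 0.0 for a in alts}
        for V in profile:
            for t in range(len(V) - 1):
                remaining = V[t:]
                z = sum(math.exp(theta[c]) for c in remaining)
                grad[V[t]] += 1.0
                for c in remaining:
                    grad[c] -= math.exp(theta[c]) / z
        for a in alts:
            theta[a] += lr * grad[a]
        mean = sum(theta.values()) / len(alts)
        for a in alts:                  # theta is only identified up to a shift: pin the mean to 0
            theta[a] -= mean
    return theta

profile = [('a', 'b', 'c')] * 4 + [('b', 'a', 'c')] * 2 + [('c', 'a', 'b')]
print(pl_mle_gradient_ascent(profile, ['a', 'b', 'c']))
```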

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):  LL: 44.8 (15.8) | Pred. LL: 87.4 (30.5) | AIC: -79.6 (31.6) | BIC: -50.5 (31.6)
Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Closing diagram: CS ↔ Social Choice — computational thinking + optimization algorithms; strategic thinking + methods/principles of aggregation]

Thank you!


Slide 36

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 37

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly coordinate their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer, Lang, &Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[Figure: three pictures A, B, C to be ranked by crowd workers]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it agrees with the ground truth
– w/p 1-p, it differs from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
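A tiny simulation (ours, not from the slides) illustrating the theorem: with, say, p = 0.6, the majority of n i.i.d. agents matches the ground truth with probability approaching 1 as n grows.

```python
import random

def majority_correct_rate(n, p, trials=5000):
    """Fraction of trials in which a majority of n i.i.d. agents,
    each correct with probability p > 0.5, matches the ground truth."""
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        if correct_votes > n / 2:
            hits += 1
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_rate(n, p=0.6))
```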

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t. each opinion is i.i.d.: if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
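To make the Mallows/Kemeny connection concrete, here is a small brute-force sketch (function names ours): since Pr(D|W) = ∏ ϕ^Kendall(V,W) and ϕ < 1, maximizing the likelihood of W is the same as minimizing the total Kendall tau distance to the profile, i.e., computing a Kemeny ranking. Brute force over all rankings is only feasible for a handful of alternatives.

```python
from itertools import permutations

def kendall(v, w):
    """Kendall tau distance: number of pairs ordered differently in v and w."""
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for i in range(len(v)) for j in range(i + 1, len(v))
               if pos_w[v[i]] > pos_w[v[j]])

def mallows_mle(profile, phi=0.5):
    """Brute-force MLE of the ground-truth ranking under the Mallows model,
    Pr(V|W) proportional to phi ** kendall(V, W).  Because phi < 1, the argmax
    does not depend on phi: it is the ranking minimizing total Kendall distance,
    i.e., a Kemeny ranking of the profile."""
    alts = sorted(profile[0])
    return min(permutations(alts),
               key=lambda w: sum(kendall(v, w) for v in profile))

profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
print(mallows_mle(profile))   # ('a', 'b', 'c') for this toy profile
```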

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given Mr with ground truth Θ:
Step 1 (statistical inference): from the Data D = (P1, P2, …, Pn), extract information about the ground truth
Step 2 (decision making): output a Decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; Decision space: a unique winner
Step 1 (MLE): from Data D = (P1, P2, …, Pn), compute the most probable ranking
Step 2: output its top-1 alternative as the Winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p | Data) assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!
(see the worked computation below)

83
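To make the 0.51 and 0.485 on this slide concrete, here is the arithmetic, assuming the uniform prior the slide states (so the posterior over p is Beta(11, 5)):

```python
from fractions import Fraction

heads, tails = 10, 4

# Likelihood reasoning (point estimate): p_hat = 10/14, then Pr(HH) = p_hat^2
p_hat = Fraction(heads, heads + tails)
print(float(p_hat ** 2))                      # 0.510... > 0.5  ->  "Yes"

# Bayesian with a uniform prior: posterior over p is Beta(heads+1, tails+1),
# and Pr(HH | Data) = E[p^2] = a*(a+1) / ((a+b)*(a+b+1)) for a Beta(a, b) posterior
a, b = heads + 1, tails + 1
pr_hh = Fraction(a * (a + 1), (a + b) * (a + b + 1))
print(float(pr_hh))                           # 0.485... < 0.5  ->  "No"
```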

Kemeny = Likelihood approach
Mr = Mallows model
Step 1 (MLE): from Data D = (P1, P2, …, Pn), compute the most probable ranking
Step 2: output its top-1 alternative as the Winner
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
Step 1: compute the likelihood of all parameters (opinions) given Data D = (P1, P2, …, Pn)
Step 2: choose the top alternative of the most probable ranking as the Winner
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet’s model
Step 1 (Bayesian update): from Data D = (P1, P2, …, Pn), compute the posterior over rankings
Step 2: output the most likely top-1 alternative as the Winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                                        Likelihood (Mallows)    Bayesian (Condorcet)
Anonymity, neutrality, monotonicity              Y                        Y
Consistency                                      N                        Y
Condorcet                                        Y                        N
Easy to compute                                  N                        N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[Framework figure: Data D → inference → information about the ground truth → decision making → Decision]
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
[Figure: utility distributions μ1, μ2, μ3 with parameters θ1, θ2, θ3 and sampled utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = [λ1/(λ1+…+λm)] × [λ2/(λ2+…+λm)] × … × [λm-1/(λm-1+λm)]
– the j-th factor is the probability that cj is the top choice among {cj,…,cm}

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
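A short helper (name ours) that evaluates the Plackett-Luce probability of a ranking directly from this sequential-choice form: at each step the next alternative is chosen with probability proportional to its weight among those not yet ranked.

```python
def plackett_luce_prob(ranking, lam):
    """Probability of a ranking under the Plackett-Luce model with positive
    weights lam: at each step the next alternative is chosen with probability
    proportional to its weight among the alternatives not yet ranked."""
    prob, remaining = 1.0, list(ranking)
    for c in ranking[:-1]:
        prob *= lam[c] / sum(lam[a] for a in remaining)
        remaining.remove(c)
    return prob

lam = {"a": 3.0, "b": 2.0, "c": 1.0}
print(plackett_luce_prob(("a", "b", "c"), lam))   # 3/6 * 2/3 = 1/3
```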

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫ μm(Um) μm-1(Um-1) ⋯ μ1(U1) dU1 ⋯ dUm-1 dUm,
where Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, and U1 from U2 to ∞

98
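Since no analytical form is known, such an orthant probability is typically estimated numerically. A minimal Monte Carlo sketch (function name, variance, and sample size are illustrative) for normal utility distributions:

```python
import random

def mc_ranking_prob(theta, ranking, samples=100_000, sd=1.0):
    """Monte Carlo estimate of Pr(ranking | theta) in a RUM with normal utility
    distributions N(theta[c], sd^2): sample utilities and count how often their
    sort order matches the given ranking."""
    hits = 0
    for _ in range(samples):
        u = {c: random.gauss(theta[c], sd) for c in theta}
        if sorted(theta, key=lambda c: u[c], reverse=True) == list(ranking):
            hits += 1
    return hits / samples

theta = {"a": 1.0, "b": 0.5, "c": 0.0}
print(mc_ranking_prob(theta, ("a", "b", "c")))
```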

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99
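One practical consequence of this log-concavity is that a generic convex optimizer finds a global MLE. The sketch below (assuming numpy and scipy are available; the function names and toy profile are ours) fits Plackett-Luce weights λi = exp(θi) by maximizing the log-likelihood, with a tiny ridge term only to pin down the common shift in θ that the model leaves free.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(theta, profile, alts):
    """Negative Plackett-Luce log-likelihood with lambda_i = exp(theta_i)."""
    idx = {a: i for i, a in enumerate(alts)}
    nll = 0.0
    for ranking in profile:
        rest = [idx[a] for a in ranking]
        while len(rest) > 1:
            top = rest[0]
            # log Pr(top chosen among the remaining alternatives)
            nll -= theta[top] - np.log(np.sum(np.exp(theta[rest])))
            rest = rest[1:]
    return nll

profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c"), ("a", "b", "c")]
alts = ["a", "b", "c"]
# log-concavity (slide above) means any local optimum found here is a global MLE
res = minimize(lambda t: neg_log_likelihood(t, profile, alts) + 1e-6 * np.sum(t**2),
               x0=np.zeros(len(alts)), method="L-BFGS-B")
print(dict(zip(alts, np.exp(res.x) / np.exp(res.x).sum())))   # normalized P-L weights
```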

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL), approximated by Gibbs sampling:
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100
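The following is only a toy Monte-Carlo EM loop for a fixed-variance normal RUM, meant to show the E/M structure; it is not the algorithm of [APX NIPS-12], which uses Gibbs sampling and exponential-family updates rather than the naive rejection sampling used here. All names and settings are illustrative, and the location parameters are only identified up to a common shift.

```python
import random

def mc_em_normal_rum(profile, alts, iters=5, samples=200, sd=1.0):
    """Crude Monte-Carlo EM sketch for a RUM with fixed-variance normal utilities.
    E-step: approximate E[U_i | ranking, theta] by rejection sampling (keep a draw
    of utilities only if its sort order matches the observed ranking).
    M-step: for normal utilities with known variance, the update for each theta_i
    is the average of these conditional expectations across the profile."""
    theta = {a: 0.0 for a in alts}
    for _ in range(iters):
        sums = {a: 0.0 for a in alts}
        for ranking in profile:
            acc = {a: [] for a in alts}
            while len(acc[alts[0]]) < samples:
                u = {a: random.gauss(theta[a], sd) for a in alts}
                if sorted(alts, key=lambda a: u[a], reverse=True) == list(ranking):
                    for a in alts:
                        acc[a].append(u[a])
            for a in alts:
                sums[a] += sum(acc[a]) / samples
        theta = {a: sums[a] / len(profile) for a in alts}
    return theta

profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c"), ("a", "b", "c")]
print(mc_em_normal_rum(profile, ["a", "b", "c"]))
```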

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):
LL: 44.8 (15.8)    Pred. LL: 87.4 (30.5)    AIC: -79.6 (31.6)    BIC: -50.5 (31.6)

Red: statistically significant with 95% confidence
102
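For reference, these criteria are computed from a model's maximized log-likelihood LL, its number of free parameters k, and the number of data points n: AIC = 2k - 2·LL and BIC = k·ln(n) - 2·LL (lower is better for both). The parameter counts and log-likelihood values in the snippet below are purely illustrative, not the ones behind the table above.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2*LL (lower is better)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k*ln(n) - 2*LL (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood

# illustrative numbers only: 50 voters, made-up log-likelihoods and parameter counts
print(aic(-120.0, 9),  bic(-120.0, 9, 50))    # e.g. a P-L-sized model
print(aic(-100.0, 18), bic(-100.0, 18, 50))   # e.g. a larger normal-RUM-sized model
```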

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS ↔ Social Choice: computational thinking + optimization algorithms; strategic thinking + methods/principles of aggregation

Thank you!


Slide 38

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 39

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer, Lang, &Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these problems for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[Pictures: A, B, C]
Turker 1: A > B > C
Turker 2: B > A
……
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents' preferences
converges in probability to the ground truth
75
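A quick numerical illustration (my own arithmetic, not from the slides): the probability that the majority of n agents matches the ground truth when each agent is independently correct with probability p = 0.6.

```python
from math import comb

def majority_correct_prob(n, p):
    """Pr(a strict majority of n i.i.d. agents is correct), n odd, each correct w/p p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n // 2 + 1, n + 1))

for n in (1, 11, 101, 1001):
    print(n, round(majority_correct_prob(n, 0.6), 4))   # tends to 1 as n grows
```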

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given "ground truth" opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
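A brute-force sketch (mine, feasible only for very small m) that makes the connection explicit: maximizing the Mallows likelihood with ϕ<1 is the same as minimizing the total Kendall tau distance, i.e., computing the Kemeny ranking.

```python
from itertools import permutations

def kendall_tau(v, w):
    """Number of pairwise comparisons on which rankings v and w disagree."""
    pos = {a: i for i, a in enumerate(w)}
    return sum(1 for i in range(len(v)) for j in range(i + 1, len(v))
               if pos[v[i]] > pos[v[j]])

def mallows_mle(profile):
    """argmax_W prod ϕ^Kendall(V,W) = argmin_W sum Kendall(V,W): the Kemeny ranking."""
    return min(permutations(profile[0]),
               key=lambda w: sum(kendall_tau(v, w) for v in profile))

print(mallows_mle([('a', 'b', 'c'), ('a', 'b', 'c'), ('b', 'c', 'a')]))   # ('a', 'b', 'c')
```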

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given Mr: the ground truth Θ generates votes P1, P2, …, Pn
Data D = (P1, …, Pn)
Step 1: statistical inference → information about the ground truth
Step 2: decision making → decision (winner, ranking, etc.)
81

Example: Kemeny
Decision space: a unique winner
Data D = (P1, P2, …, Pn)
Step 1: MLE under Mr = Mallows model → the most probable ranking
Step 2: top-1 alternative of that ranking → winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails (Credit: Panos Ipeirotis & Roy Radner)
– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p|Data) assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
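Both answers on this slide can be reproduced with a few lines (my own arithmetic): the plug-in estimate uses p = 10/14, and the Bayesian answer is the posterior predictive under a uniform prior, which is E[p^2] for a Beta(11, 5) posterior.

```python
heads, tails = 10, 4

# Likelihood (plug-in) reasoning
p_hat = heads / (heads + tails)
print(round(p_hat, 3), round(p_hat ** 2, 3))             # 0.714, 0.51

# Bayesian reasoning: uniform prior -> Beta(heads+1, tails+1) posterior,
# and Pr(2 heads | data) = E[p^2] = a(a+1) / ((a+b)(a+b+1)) for Beta(a, b)
a, b = heads + 1, tails + 1
print(round(a * (a + 1) / ((a + b) * (a + b + 1)), 3))   # 0.485
```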

Kemeny = Likelihood approach
Data D = (P1, P2, …, Pn)
Step 1: MLE under Mr = Mallows model → the most probable ranking
Step 2: top-1 alternative → winner
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Data D = (P1, P2, …, Pn)
Mr = Condorcet model
Step 1: compute the likelihood for all parameters (opinions) → the most probable ranking
Step 2: choose the top-1 alternative of the most probable ranking → winner
85

Example: Bayesian [Young APSR-88]
Data D = (P1, P2, …, Pn)
Mr = Condorcet's model
Step 1: Bayesian update → posterior over rankings
Step 2: most likely top-1 → winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                        Anonymity, neutrality,   Consistency   Condorcet   Easy to
                        monotonicity                                       compute
Likelihood (Mallows)    Y                        N             Y           N
Bayesian (Condorcet)    Y                        Y             N           Y

Decision space: single winners
Assume a uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[Pipeline: Data D → inference → information about the ground truth → decision making → Decision]

• Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
[Figure: the three utility distributions with parameters θ1, θ2, θ3 and one draw U1, U2, U3]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
[Figure: parameters (θ1, θ2, θ3) generate Agent 1: P1 = c2≻c1≻c3, …, Agent n: Pn = c1≻c2≻c3]
96
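A tiny generator for this picture, assuming for illustration a normal RUM with unit variances (the distribution choice and names are mine, not the slide's):

```python
import numpy as np

def sample_profile(theta, n, seed=0):
    """Sample n rankings from a normal RUM: each agent draws Ui ~ N(theta_i, 1)
    independently and ranks the alternatives by decreasing perceived utility."""
    rng = np.random.default_rng(seed)
    profile = []
    for _ in range(n):
        u = rng.normal(theta, 1.0)
        profile.append(tuple(np.argsort(-u)))   # alternative indices, best first
    return profile

print(sample_profile(np.array([1.0, 0.5, 0.0]), 3))
```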

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor: c1 is the top choice in {c1,…,cm}; the last factor: cm-1 is preferred to cm

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
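The product form above translates directly into code; a minimal sketch (mine):

```python
def plackett_luce_prob(ranking, lam):
    """Pr(ranking | lambda) under Plackett-Luce: pick the next-ranked alternative
    with probability proportional to its lambda among those still remaining."""
    prob, remaining = 1.0, list(ranking)
    while len(remaining) > 1:
        top = remaining.pop(0)
        prob *= lam[top] / (lam[top] + sum(lam[a] for a in remaining))
    return prob

# lambda = (2, 1, 1): Pr(c0 > c1 > c2) = 2/4 * 1/2 = 0.25
print(plackett_luce_prob((0, 1, 2), {0: 2.0, 1: 1.0, 2: 1.0}))
```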

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{+∞} μm(Um) ∫_{Um}^{+∞} μm-1(Um-1) ⋯ ∫_{U2}^{+∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
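Even without a closed form, the nested integral above is straightforward to estimate by simple Monte Carlo; a sketch assuming unit-variance normal distributions (my choice for illustration):

```python
import numpy as np

def ranking_prob_mc(ranking, theta, samples=200_000, seed=0):
    """Monte Carlo estimate of Pr(c_{r1} > ... > c_{rm} | theta) for a normal RUM
    with unit variances, i.e. the nested integral on this slide."""
    rng = np.random.default_rng(seed)
    u = rng.normal(theta, 1.0, size=(samples, len(theta)))
    ordered = u[:, list(ranking)]                 # utilities in the claimed order
    return float(np.mean(np.all(ordered[:, :-1] > ordered[:, 1:], axis=1)))

print(ranking_prob_mc((0, 1, 2), np.array([1.0, 0.5, 0.0])))
```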

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Compute the expected log-likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
– the expectation is approximately computed by Gibbs sampling

• M-step
– Choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100
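A much-simplified sketch of the idea (not the paper's implementation): unit-variance normal utilities, a Gibbs sampler over the latent utilities approximating the E-step with one retained sample per vote, and a closed-form M-step. It assumes SciPy's truncnorm; all names and constants are mine.

```python
import numpy as np
from scipy.stats import truncnorm

def mcem_normal_rum(profile, m, iters=20, sweeps=30, seed=0):
    """MC-EM sketch for a normal RUM with unit variances.
    profile: list of full rankings over alternatives 0..m-1, best first."""
    np.random.seed(seed)
    theta = np.zeros(m)
    for _ in range(iters):
        total = np.zeros(m)
        for ranking in profile:
            u = np.empty(m)
            u[list(ranking)] = np.arange(m, 0, -1, dtype=float)   # order-consistent start
            for _ in range(sweeps):                               # Gibbs sweeps (E-step)
                for j, a in enumerate(ranking):
                    hi = u[ranking[j - 1]] if j > 0 else np.inf
                    lo = u[ranking[j + 1]] if j < m - 1 else -np.inf
                    u[a] = truncnorm.rvs(lo - theta[a], hi - theta[a],
                                         loc=theta[a], scale=1.0)
            total += u                                            # keep the last sweep's sample
        theta = total / len(profile)                              # M-step for N(theta_i, 1)
        theta -= theta.mean()                                     # fix location (identifiability)
    return theta

profile = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (0, 1, 2)]
print(np.round(mcem_normal_rum(profile, 3), 2))
```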

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters

                            LL            Pred. LL      AIC            BIC
Value(Normal) - Value(PL)   44.8 (15.8)   87.4 (30.5)   -79.6 (31.6)   -50.5 (31.6)

Red: statistically significant with 95% confidence
102
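For reference, the criteria compared here in their standard form (lower AIC/BIC is better); the parameter counts in the comment are my rough assumption, not taken from the slide.

```python
from math import log

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2*LL (lower is better)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k*ln(n) - 2*LL (lower is better)."""
    return k * log(n) - 2 * log_likelihood

# e.g. a normal RUM with a mean and a variance per alternative has roughly 2m
# parameters, while Plackett-Luce has roughly m; n is the number of votes
print(aic(-2000.0, 18), round(bic(-2000.0, 18, 50), 1))
```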

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS ⇄ Social Choice
– Computational thinking + optimization algorithms
– Strategic thinking + methods/principles of aggregation

Thank you!
Slide 40

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 41

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(image: three pictures A, B, C to be ranked)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p it agrees with the ground truth
– w/p 1-p it differs from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
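A quick simulation (added for illustration, not part of the original slides): for any p > 0.5, the probability that the majority of n i.i.d. agents recovers the ground truth approaches 1 as n grows.

import random

def majority_correct_rate(n, p, trials=2000):
    # Fraction of trials in which the majority of n i.i.d. agents,
    # each correct w/p p, matches the ground truth.
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        hits += correct_votes > n / 2
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_rate(n, p=0.6))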

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76
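A schematic of the MLE mechanism in code (my sketch; the model and tie-breaking are placeholders): enumerate the parameters, multiply the per-vote likelihoods, and return an argmax, breaking ties randomly as on the slide.

import math, random

def mle_mechanism(profile, parameters, likelihood):
    # profile: list of votes; parameters: candidate ground truths;
    # likelihood(vote, theta) = Pr(vote | theta)
    def log_l(theta):
        return sum(math.log(likelihood(v, theta)) for v in profile)
    best = max(log_l(t) for t in parameters)
    winners = [t for t in parameters if abs(log_l(t) - best) < 1e-12]
    return random.choice(winners)  # break ties randomly

# Toy use: two alternatives, jury-style noise with p = 0.6
p = 0.6
votes = ['a', 'a', 'b', 'a', 'b']
print(mle_mechanism(votes, ['a', 'b'],
                    lambda v, t: p if v == t else 1 - p))  # 'a'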

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p
77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
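To make the last bullet concrete, here is a brute-force sketch (mine, not from the slides): the Mallows MLE ranking minimizes the total Kendall-tau distance to the profile, i.e., it is the Kemeny ranking. This enumeration is only feasible for small m.

from itertools import permutations

def kendall_tau(v, w):
    # Number of pairwise comparisons on which rankings v and w disagree.
    pos_w = {c: i for i, c in enumerate(w)}
    return sum(1 for i in range(len(v)) for j in range(i + 1, len(v))
               if pos_w[v[i]] > pos_w[v[j]])

def kemeny_ranking(profile):
    # argmin over rankings of the summed Kendall-tau distance to the profile
    # = MLE of Mallows for any dispersion phi < 1.
    alts = profile[0]
    return min(permutations(alts),
               key=lambda w: sum(kendall_tau(v, w) for v in profile))

profile = [('a', 'b', 'c'), ('a', 'b', 'c'), ('b', 'c', 'a')]
print(kemeny_ranking(profile))  # ('a', 'b', 'c')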

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Step 1: statistical inference — given the model Mr (with
ground truth Θ), extract information about the ground truth
from the data D = (P1, …, Pn)
Step 2: decision making — given Mr and the inferred
information, output a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1: MLE — the most probable ranking given the data D = (P1, …, Pn)
• Step 2: output the top-1 alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data), assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!
83
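Both numbers on this slide can be reproduced directly; a small sketch (mine), using the Beta posterior that a uniform prior induces:

# Likelihood (MLE) answer
p_hat = 10 / 14
print(p_hat ** 2)            # 0.5102... > 0.5  ->  "Yes"

# Bayesian answer: uniform prior on p, data = 10 heads, 4 tails
# posterior is Beta(11, 5); Pr(2 heads | Data) = E[p^2] = 11*12 / (16*17)
print(11 * 12 / (16 * 17))   # 0.4853... < 0.5  ->  "No"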

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1: MLE — the most probable ranking given the data D
• Step 2: output the top-1 alternative
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood for all parameters (opinions)
• Step 2: choose the top-1 alternative of the most probable ranking
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1: Bayesian update — the posterior over rankings given the data D
• Step 2: output the most likely top-1 alternative
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                        Anonymity,      Consistency   Condorcet   Easy to
                        neutrality,                               compute
                        monotonicity
Likelihood (Mallows)    Y               N             Y           N
Bayesian (Condorcet)    Y               Y             N           N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(framework recap: Data D → inference → information about
the ground truth → decision making → decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
(figure: the utility distributions parameterized by θ1, θ2, θ3 and the sampled utilities U1, U2, U3)
95
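For intuition, such ranking probabilities can be estimated by Monte Carlo; a small sketch (mine), assuming unit-variance normal utility distributions with the given means:

import random

def prob_ranking_mc(theta, ranking, samples=100000):
    # theta: list of means, one per alternative (unit-variance normals);
    # ranking: tuple of alternative indices, most preferred first.
    hits = 0
    for _ in range(samples):
        u = [random.gauss(mean, 1.0) for mean in theta]
        if all(u[ranking[k]] > u[ranking[k + 1]] for k in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

# Pr(c2 > c1 > c3) when the means are theta = (0.0, 1.0, -1.0)
print(prob_ranking_mc([0.0, 1.0, -1.0], (1, 0, 2)))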

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters θ1, θ2, θ3 generate each agent's ranking independently:
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) =
[λ1/(λ1+…+λm)] × [λ2/(λ2+…+λm)] × … × [λm-1/(λm-1+λm)]
– the first factor is the probability that c1 is the top choice in {c1,…,cm};
the last factor is the probability that cm-1 is preferred to cm

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
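The closed form above is straightforward to evaluate; a minimal sketch (mine) of the Plackett-Luce probability of a single ranking:

def plackett_luce_prob(ranking, lam):
    # ranking: alternatives from most to least preferred;
    # lam: dict mapping each alternative to its positive parameter.
    prob = 1.0
    remaining = list(ranking)
    while len(remaining) > 1:
        top = remaining[0]
        prob *= lam[top] / sum(lam[a] for a in remaining)
        remaining.pop(0)
    return prob

lam = {'a': 3.0, 'b': 2.0, 'c': 1.0}
print(plackett_luce_prob(('a', 'b', 'c'), lam))  # 3/6 * 2/3 = 1/3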

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) =
∫_{-∞}^{∞} ∫_{Um}^{∞} … ∫_{U2}^{∞} μm(Um) μm-1(Um-1) ⋯ μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected
log-likelihood ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
(approximately computed by Gibbs sampling)
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Repeat until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                           LL            Pred. LL      AIC            BIC
Value(Normal) - Value(PL)  44.8 (15.8)   87.4 (30.5)   -79.6 (31.6)   -50.5 (31.6)

(red in the original slide: statistically significant with 95% confidence)
102
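For reference, the criteria compared here are simple functions of the log-likelihood; a sketch (mine) of the standard definitions, with illustrative numbers:

import math

def aic(log_likelihood, k):
    # k = number of free parameters
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    # n = number of observations (votes)
    return k * math.log(n) - 2 * log_likelihood

# Higher LL and predictive LL are better; lower AIC and BIC are better.
print(aic(-120.0, k=9), bic(-120.0, k=9, n=50))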

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS ↔ Social Choice:
Computational thinking + optimization algorithms
Strategic thinking + methods/principles of aggregation

Thank you!


Slide 42

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 43

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– a probability p > 0.5
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it agrees with the ground truth
– w/p 1-p, it differs from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth (a quick simulation sketch follows below)
75
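A quick simulation sketch of the statement above (my own, with made-up numbers): each of n agents independently agrees with the ground truth w/p p > 0.5, and the probability that the strict majority is correct tends to 1 as n grows.

import random

def majority_correct_rate(n, p, trials=2000):
    # fraction of trials in which a strict majority of n i.i.d. agents,
    # each correct w/p p, recovers the ground truth
    correct = 0
    for _ in range(trials):
        right = sum(1 for _ in range(n) if random.random() < p)
        if right > n / 2:
            correct += 1
    return correct / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_rate(n, p=0.55))
# the printed rate climbs from about 0.55 toward 1.0 as n increases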

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ → P1, P2, …, Pn
– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule (a brute-force check follows below)
78
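A brute-force sketch (my own, feasible only for a handful of alternatives) of why the MLE ranking under Mallows is the Kemeny ranking: Pr(V|W) ∝ ϕ^Kendall(V,W) with ϕ < 1 and a normalization constant that does not depend on W, so maximizing the likelihood of W is the same as minimizing the total Kendall-tau distance to the profile.

from itertools import combinations, permutations

def kendall(v, w):
    # number of pairs of alternatives ordered differently in the two rankings
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2)
               if (pos_v[a] - pos_v[b]) * (pos_w[a] - pos_w[b]) < 0)

def kemeny_and_mallows_mle(profile, alternatives, phi=0.5):
    kemeny = min(permutations(alternatives),
                 key=lambda w: sum(kendall(p, w) for p in profile))
    mle = max(permutations(alternatives),
              key=lambda w: phi ** sum(kendall(p, w) for p in profile))
    return kemeny, mle          # equal whenever the optimum is unique

D = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c')]
print(kemeny_and_mallows_mle(D, ('a', 'b', 'c')))   # both are ('a', 'b', 'c')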

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a parametric model Mr with ground truth Θ:
Step 1 (statistical inference): from the data D = (P1, P2, …, Pn), extract information about the ground truth
Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
Decision space: a unique winner
Mr = Mallows model
Step 1 (MLE): from the data D = (P1, P2, …, Pn), compute the most probable ranking
Step 2 (top-1 alternative): output its top-ranked alternative as the winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p|Data), assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

(a two-line check of these numbers follows below)

83
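A two-line check of the numbers above (my own sketch): with a uniform prior the posterior over p after 10 heads and 4 tails is Beta(11, 5), so the Bayesian answer is E[p^2] = (11/16)·(12/17) ≈ 0.485, while the plug-in MLE answer is (10/14)^2 ≈ 0.51.

a, b = 1 + 10, 1 + 4                                 # Beta posterior after 10 heads, 4 tails
bayes = (a / (a + b)) * ((a + 1) / (a + b + 1))      # E[p^2] under Beta(a, b)
mle = (10 / 14) ** 2                                 # plug-in (MLE) prediction
print(round(bayes, 3), round(mle, 3))                # 0.485 0.51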

Kemeny = Likelihood approach
Mr = Mallows model
Step 1 (MLE): from the data D = (P1, P2, …, Pn), compute the most probable ranking
Step 2 (top-1 alternative): output its top-ranked alternative as the winner
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
Step 1: from the data D = (P1, P2, …, Pn), compute the likelihood for all parameters (opinions)
Step 2: choose the top alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet’s model
Step 1 (Bayesian update): from the data D = (P1, P2, …, Pn), compute the posterior over rankings
Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs (a toy noise model illustrating this is sketched below)
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89
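A toy construction (mine, in the spirit of the Conitzer & Sandholm result, not their exact model): take Pr(vote V | true winner c) ∝ exp(s_V(c)), where s_V(c) is the score V assigns to c. By symmetry the normalizer is the same for every c, so the MLE winner coincides with the positional-scoring winner.

import math
from itertools import permutations

def score_given(vote, c, s):
    # points the vote assigns to c under score vector s (best position first)
    return s[vote.index(c)]

def mle_winner(profile, alternatives, s):
    def loglik(c):
        logZ = math.log(sum(math.exp(score_given(v, c, s))
                            for v in permutations(alternatives)))
        return sum(score_given(v, c, s) - logZ for v in profile)
    return max(alternatives, key=loglik)

def scoring_winner(profile, alternatives, s):
    return max(alternatives, key=lambda c: sum(score_given(v, c, s) for v in profile))

D = [('c1', 'c2', 'c3'), ('c2', 'c1', 'c3'), ('c3', 'c1', 'c2')]
borda = (2, 1, 0)
print(mle_winner(D, ('c1', 'c2', 'c3'), borda),
      scoring_winner(D, ('c1', 'c2', 'c3'), borda))   # both print c1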

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(Data D → inference → information about the ground truth → decision making → Decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
[figure: the three utility distributions μ1, μ2, μ3 (parameters θ1, θ2, θ3) with sampled utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the i-th factor is the probability that ci is the top choice among {ci,…,cm}; the last factor says cm-1 is preferred to cm in {cm-1, cm}

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
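A minimal sketch of evaluating the P-L probability of a ranking by the product formula above (the weights below are assumed values, not from the tutorial).

from itertools import permutations

def plackett_luce_prob(ranking, lam):
    # ranking: alternatives listed best-first; lam: alternative -> positive weight λ
    prob, remaining = 1.0, list(ranking)
    while len(remaining) > 1:
        prob *= lam[remaining[0]] / sum(lam[a] for a in remaining)
        remaining.pop(0)
    return prob

lam = {'c1': 3.0, 'c2': 2.0, 'c3': 1.0}
print(plackett_luce_prob(['c1', 'c2', 'c3'], lam))      # 3/6 * 2/3 = 1/3
print(sum(plackett_luce_prob(list(r), lam)
          for r in permutations(lam)))                   # ≈ 1.0: a proper distribution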

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} ∫_{Um}^{∞} ⋯ ∫_{U2}^{∞} μm(Um) μm-1(Um-1) ⋯ μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
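Even without an analytical solution, the ranking probability is easy to estimate by simulation; a Monte Carlo sketch with assumed parameters (independent normals with fixed variance 1 and means of my choosing):

import random

def mc_ranking_prob(means, sd=1.0, samples=100_000):
    # estimates Pr(U1 > U2 > ... > Um) when Ui ~ Normal(means[i], sd), independently
    hits = 0
    for _ in range(samples):
        u = [random.gauss(m, sd) for m in means]
        if all(u[i] > u[i + 1] for i in range(len(u) - 1)):
            hits += 1
    return hits / samples

print(mc_ranking_prob((1.0, 0.5, 0.0)))   # well above 1/6, since the means favor this ranking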

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99
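A small illustration (my own sketch) of why log-concavity matters in practice: for P-L written as a location family with θi = log λi, the log-likelihood is concave, so plain finite-difference gradient ascent from any starting point reaches the global MLE (one θ is pinned to 0 for identifiability).

import math

def pl_loglik(theta, profile):
    # profile: votes as tuples of alternative indices, best-first
    lam = [math.exp(t) for t in theta]
    ll = 0.0
    for vote in profile:
        remaining = list(vote)
        while len(remaining) > 1:
            ll += theta[remaining[0]] - math.log(sum(lam[a] for a in remaining))
            remaining.pop(0)
    return ll

def fit_pl(profile, m, steps=2000, lr=0.05, eps=1e-5):
    theta = [0.0] * m
    for _ in range(steps):
        grad = []
        for i in range(m):
            up, down = theta[:], theta[:]
            up[i] += eps
            down[i] -= eps
            grad.append((pl_loglik(up, profile) - pl_loglik(down, profile)) / (2 * eps))
        theta = [t + lr * g for t, g in zip(theta, grad)]
        theta = [t - theta[0] for t in theta]   # pin theta[0] = 0 (likelihood is shift-invariant)
    return theta

D = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (0, 1, 2)]
print(fit_pl(D, 3))   # theta[0] = 0 > theta[1] > theta[2]: alternative 0 is strongest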

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):   LL: 44.8 (15.8)   Pred. LL: 87.4 (30.5)   AIC: -79.6 (31.6)   BIC: -50.5 (31.6)
Red: statistically significant with 95% confidence
102
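For reference, a sketch of the standard criteria behind the table (the formulas are the usual ones; the numbers below are placeholders, not the election dataset):

import math

def aic(log_likelihood, k):
    # k = number of free model parameters; lower is better
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    # n = number of observations (votes); lower is better
    return k * math.log(n) - 2 * log_likelihood

ll, k, n = -120.0, 8, 50              # hypothetical fitted log-likelihood
print(aic(ll, k), bic(ll, k, n))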

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 44

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 45

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t. each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p
77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
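To see the connection between the Mallows MLE and the Kemeny rule concretely, here is a brute-force sketch over all rankings (the toy profile and ϕ = 0.5 are assumptions; enumeration is feasible only for small m):

```python
from itertools import combinations, permutations

def kendall_tau(v, w):
    """Number of pairwise comparisons on which rankings v and w disagree."""
    pos_v = {x: i for i, x in enumerate(v)}
    pos_w = {x: i for i, x in enumerate(w)}
    return sum((pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b])
               for a, b in combinations(v, 2))

def mallows_mle(profile, phi=0.5):
    """Ground-truth ranking W maximizing prod_P phi**Kendall(P, W)."""
    alts = profile[0]
    return max(permutations(alts),
               key=lambda w: phi ** sum(kendall_tau(p, w) for p in profile))

def kemeny(profile):
    """Ranking minimizing the total Kendall tau distance to the profile."""
    alts = profile[0]
    return min(permutations(alts),
               key=lambda w: sum(kendall_tau(p, w) for p in profile))

profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]  # assumed toy profile
print(mallows_mle(profile), kemeny(profile))  # the two rankings coincide
```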

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given the model Mr (a ground truth Θ generating the votes P1, P2, …, Pn):
• Step 1 (statistical inference): from the data D = (P1, P2, …, Pn), extract information about the ground truth
• Step 2 (decision making): from that information, output a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): compute the most probable ranking from the data D = (P1, P2, …, Pn)
• Step 2 (top-1 alternative): output the top alternative of that ranking as the winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = 0.714^2 ≈ 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
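Both numbers on this slide can be checked in a few lines; the sketch below uses the standard fact that a uniform prior updated with 10 heads and 4 tails gives a Beta(11, 5) posterior, so Pr(2 heads | Data) = E[p^2] = a(a+1)/((a+b)(a+b+1)):

```python
# Likelihood (plug-in) reasoning: point estimate p_hat = 10/14
p_hat = 10 / 14
print(round(p_hat ** 2, 3))  # ≈ 0.51 > 0.5  -> "Yes"

# Bayesian reasoning with a uniform prior: posterior is Beta(11, 5),
# and Pr(2 heads | Data) = E[p^2] = a(a+1) / ((a+b)(a+b+1))
a, b = 11, 5
print(round(a * (a + 1) / ((a + b) * (a + b + 1)), 3))  # ≈ 0.485 < 0.5 -> "No"
```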

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1 (MLE): compute the most probable ranking from the data D
• Step 2 (top-1 alternative): output its top alternative as the winner
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood of all parameters (opinions) given the data D
• Step 2 (top-1 alternative): choose the top alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1 (Bayesian update): compute the posterior over rankings given the data D
• Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
• Likelihood (Mallows): anonymity/neutrality/monotonicity Y; consistency N; Condorcet consistency Y; easy to compute N
• Bayesian (Condorcet): anonymity/neutrality/monotonicity Y; consistency N; Condorcet consistency N; easy to compute Y

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠∅, then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(diagram: Data D → inference → information about the ground truth → decision making → Decision)

• Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
(figure: the three utility distributions, with parameters θ1, θ2, θ3 and sampled utilities U1, U2, U3)

95
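As a concrete illustration of the generative process, here is a minimal sketch that draws one vote from a RUM with normal utility distributions (the means and the unit variance are assumptions for illustration):

```python
import random

def sample_ranking(means, sigma=1.0):
    """Draw one vote from a RUM: sample a perceived utility per alternative
    and rank the alternatives by decreasing utility."""
    utilities = {c: random.gauss(mu, sigma) for c, mu in means.items()}
    return tuple(sorted(utilities, key=utilities.get, reverse=True))

means = {"c1": 1.0, "c2": 0.5, "c3": 0.0}  # assumed ground-truth parameters
profile = [sample_ranking(means) for _ in range(5)]
print(profile)
```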

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
(figure: the parameters θ1, θ2, θ3 generate Agent 1's vote P1 = c2≻c1≻c3, …, Agent n's vote Pn = c1≻c2≻c3)
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = λ1/(λ1 + λ2 + … + λm) × λ2/(λ2 + … + λm) × … × λm-1/(λm-1 + λm)
– the first factor: c1 is the top choice in {c1, …, cm}; …; the last factor: cm-1 is preferred to cm

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
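The product form above makes the P-L likelihood easy to evaluate; a minimal sketch (the λ values and alternative names are assumptions for illustration):

```python
def plackett_luce_prob(ranking, lam):
    """Pr(ranking | λ) = product over positions i of
    λ_{ranking[i]} / (λ_{ranking[i]} + ... + λ_{ranking[m-1]})."""
    prob = 1.0
    for i in range(len(ranking) - 1):
        remaining = sum(lam[c] for c in ranking[i:])
        prob *= lam[ranking[i]] / remaining
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}  # assumed parameters
print(plackett_luce_prob(("c1", "c2", "c3"), lam))  # 3/6 * 2/3 = 1/3
```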

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞

98
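Since no analytical solution is known, the nested integral above is often estimated numerically; here is a minimal Monte Carlo sketch (the means, the unit variance, and the sample size are assumptions for illustration):

```python
import random

def mc_ranking_prob(ranking, means, sigma=1.0, samples=100_000):
    """Monte Carlo estimate of Pr(ranking | Θ) under a RUM with normal
    utility distributions: count how often the sampled utilities are ordered
    exactly as in the given ranking."""
    hits = 0
    for _ in range(samples):
        u = {c: random.gauss(means[c], sigma) for c in means}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

means = {"c1": 1.0, "c2": 0.5, "c3": 0.0}  # assumed parameters
print(mc_ranking_prob(("c1", "c2", "c3"), means))
```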

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maximizers is convex

99
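Given this log-concavity, a plain numerical optimizer already finds the global MLE for P-L; a minimal sketch, assuming a tiny toy profile and the reparameterization λi = exp(θi) with θ0 fixed at 0 for identifiability (scipy is used here only as a generic optimizer, not as the algorithm from the slides):

```python
import numpy as np
from scipy.optimize import minimize

profile = [(0, 1, 2), (0, 2, 1), (1, 0, 2)]  # assumed toy rankings over 3 alternatives
m = 3

def neg_log_likelihood(theta_free):
    theta = np.concatenate(([0.0], theta_free))  # fix θ0 = 0
    lam = np.exp(theta)
    ll = 0.0
    for ranking in profile:
        for i in range(m - 1):
            remaining = lam[list(ranking[i:])].sum()
            ll += theta[ranking[i]] - np.log(remaining)  # log of the P-L factor
    return -ll

res = minimize(neg_log_likelihood, x0=np.zeros(m - 1))  # local optimum = global optimum
theta_hat = np.concatenate(([0.0], res.x))
print(np.exp(theta_hat))  # estimated λ's (identified only up to a common scaling)
```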

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log-likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), with g(Data, Θt) approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters
Value(Normal) - Value(PL):
– LL: 44.8 (15.8)
– Pred. LL: 87.4 (30.5)
– AIC: -79.6 (31.6)
– BIC: -50.5 (31.6)
Red: statistically significant with 95% confidence
102
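For reference, the criteria in the table are simple functions of the fitted log-likelihood; a minimal helper (the numbers in the example call are placeholders, not values from the dataset above):

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2*LL (lower is better)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k*ln(n) - 2*LL (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood

# hypothetical fitted log-likelihood, parameter count k, and sample size n
print(aic(log_likelihood=-120.0, k=9), bic(log_likelihood=-120.0, k=9, n=50))
```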

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

(diagram: CS ↔ Social Choice, via computational thinking + optimization algorithms and strategic thinking + methods/principles of aggregation)

Thank you!


Slide 46

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 47

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
[Figure: decision space = a unique winner. With Mr = the Mallows model,
Step 1 (MLE) maps data D = (P1, …, Pn) to the most probable ranking;
Step 2 takes its top-1 alternative.]
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p|Data) assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
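The two numbers on this slide can be reproduced in a few lines; a minimal check (assumed illustration): the likelihood side plugs in the MLE p = 10/14, while the Bayesian side uses the posterior Beta(11, 5) obtained from a uniform prior, under which Pr(two heads | data) = E[p^2] = 11·12/(16·17) ≈ 0.485.

```python
heads, tails = 10, 4

# Likelihood reasoning: plug in the MLE
p_mle = heads / (heads + tails)
print(round(p_mle, 3), round(p_mle ** 2, 3))        # 0.714 0.51

# Bayesian reasoning: uniform prior -> posterior Beta(a, b) with a = 11, b = 5,
# and E[p^2] = a*(a+1) / ((a+b)*(a+b+1))
a, b = heads + 1, tails + 1
print(round(a * (a + 1) / ((a + b) * (a + b + 1)), 3))   # 0.485
```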

Kemeny = Likelihood approach
[Figure: Step 1 (MLE) with Mr = the Mallows model gives the most probable ranking from data D = (P1, …, Pn);
Step 2 takes its top-1 alternative as the winner.
This is the Kemeny rule (for a single winner)!]
84

Kemeny = Likelihood approach (2)
[Figure: with Mr = Condorcet's model,
Step 1 computes the likelihood of all parameters (opinions) from data D = (P1, …, Pn);
Step 2 chooses the top alternative of the most probable ranking as the winner.]
85

Example: Bayesian [Young APSR-88]
[Figure: with Mr = Condorcet's model,
Step 1 (Bayesian update) turns data D = (P1, …, Pn) into a posterior over rankings;
Step 2 picks the most likely top-1 alternative as the winner.]
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89
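Consistency of positional scoring rules comes from the additivity of scores across sub-profiles. A small Python sketch (assumed, with a made-up Borda example) illustrating the property that if r(D1)∩r(D2)≠ϕ then r(D1∪D2)=r(D1)∩r(D2):

```python
def scores(profile, vector):
    """profile: list of rankings (best first); vector: positional score vector."""
    total = {}
    for ranking in profile:
        for pos, alt in enumerate(ranking):
            total[alt] = total.get(alt, 0) + vector[pos]
    return total

def winners(profile, vector):
    s = scores(profile, vector)
    best = max(s.values())
    return {a for a, v in s.items() if v == best}

borda = (2, 1, 0)
D1 = [("a", "b", "c"), ("b", "a", "c")]
D2 = [("a", "c", "b"), ("b", "c", "a")]
common = winners(D1, borda) & winners(D2, borda)
if common:   # consistency: the winners of D1 ∪ D2 are exactly the common winners
    print(common == winners(D1 + D2, borda))   # True on this example
```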

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical decision framework
[Figure: the framework again: Data D → inference → information about the ground truth → decision making → Decision]
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr(U2 > U1 > U3), where each Ui ∼ μi
[Figure: utility distributions μ1, μ2, μ3, parameterized by θ1, θ2, θ3, generate perceived utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
[Figure: the parameters (θ1, θ2, θ3) independently generate Agent 1's vote P1 = c2≻c1≻c3, …, Agent n's vote Pn = c1≻c2≻c3]
96
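A minimal sketch (assumed, not from the slides) of the two slides above, for a RUM with normal utility distributions: each agent independently draws a perceived utility Ui ~ N(θi, 1) for every alternative and reports the alternatives sorted by decreasing utility; the θ values below are hypothetical.

```python
import random

def sample_ranking(thetas, sd=1.0):
    """thetas: dict alternative -> mean utility."""
    utilities = {alt: random.gauss(mean, sd) for alt, mean in thetas.items()}
    return tuple(sorted(utilities, key=utilities.get, reverse=True))

def sample_profile(thetas, n):
    return [sample_ranking(thetas) for _ in range(n)]

thetas = {"c1": 1.0, "c2": 0.5, "c3": 0.0}
print(sample_profile(thetas, n=5))
# e.g. [('c1', 'c2', 'c3'), ('c2', 'c1', 'c3'), ...]
```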

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = [λ1 / (λ1 + … + λm)] × [λ2 / (λ2 + … + λm)] × … × [λm-1 / (λm-1 + λm)]
– the i-th factor: ci is the top choice in {ci,…,cm}; the last factor: cm-1 is preferred to cm

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
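The product formula above can be read as a sequential choice process: repeatedly pick the next-ranked alternative with probability proportional to its λ among the alternatives not yet placed. A minimal Python sketch (assumed; the λ values are hypothetical):

```python
def plackett_luce_prob(ranking, lam):
    """ranking: alternatives from best to worst; lam: dict alternative -> λ > 0."""
    remaining = list(ranking)
    prob = 1.0
    while len(remaining) > 1:
        # probability that the best remaining alternative is chosen next
        prob *= lam[remaining[0]] / sum(lam[a] for a in remaining)
        remaining.pop(0)
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}
print(plackett_luce_prob(("c1", "c2", "c3"), lam))   # (3/6)*(2/3) = 1/3
```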

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) =
∫ μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm,
where Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, and U1 from U2 to ∞

98
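Although the integral has no known closed form, it is straightforward to approximate by Monte Carlo; a minimal sketch (assumed, not the authors' method): sample utilities from the normal distributions and count how often the order c1 ≻ … ≻ cm holds.

```python
import random

def mc_ranking_prob(means, sd=1.0, samples=100_000):
    """means: the θi's, listed in the order c1, ..., cm."""
    hits = 0
    for _ in range(samples):
        u = [random.gauss(m, sd) for m in means]
        if all(u[i] > u[i + 1] for i in range(len(u) - 1)):
            hits += 1
    return hits / samples

print(mc_ranking_prob([1.0, 0.5, 0.0]))   # ≈ Pr(c1 ≻ c2 ≻ c3 | Θ)
```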

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected
log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)),
where g(Data, Θt) is approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters

Value(Normal) - Value(PL):
– LL: 44.8 (15.8)
– Pred. LL: 87.4 (30.5)
– AIC: -79.6 (31.6)
– BIC: -50.5 (31.6)
Red: statistically significant with 95% confidence
102
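For reference, the criteria compared in the table can be computed from a fitted model's log-likelihood LL, its number of free parameters k, and the number of observations n; a small sketch with hypothetical numbers (higher LL is better; lower AIC/BIC is better):

```python
import math

def aic(ll, k):
    # Akaike information criterion
    return 2 * k - 2 * ll

def bic(ll, k, n):
    # Bayesian information criterion
    return k * math.log(n) - 2 * ll

print(aic(ll=-120.0, k=9), bic(ll=-120.0, k=9, n=50))   # hypothetical values
```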

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Figure: CS + Social Choice: computational thinking + optimization algorithms; strategic thinking + methods/principles of aggregation]

Thank you!


Slide 48

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 49

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
Step 1: compute the likelihood of all parameters (opinions) from the data D = (P1, P2, …, Pn)
Step 2: choose the top alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet's model
Step 1 (Bayesian update): compute the posterior over rankings from the data D = (P1, P2, …, Pn)
Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
[Table comparing the Likelihood (Mallows) approach and the Bayesian (Condorcet) approach on four criteria: anonymity/neutrality/monotonicity, consistency, Condorcet consistency, and ease of computation]

Decision space: single winners
Assume a uniform prior in the Bayesian approach
Principle: statistical decision theory

87

Outline: statistical approaches
• Condorcet's MLE model (history)
• Why MLE?
• Why Condorcet's model?
• A general framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical decision framework
(Data D → inference → information about the ground truth → decision making → decision)

• Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
• Condorcet's MLE model (history)
• Why MLE?
• Why Condorcet's model?
• A general framework
• Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
[Figure: the three utility distributions, parameterized by θ1, θ2, θ3, with sampled utilities U1, U2, U3]
95
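A minimal sketch of the generative process (not from the slides), assuming normal utility distributions with unit variance and hypothetical means: draw one perceived utility per alternative, then rank by decreasing utility.

import random

def sample_ranking(means, sd=1.0):
    """One vote from a RUM with normal utility distributions: draw
    U_i ~ N(theta_i, sd^2) independently, then sort alternatives by
    decreasing perceived utility."""
    utilities = {c: random.gauss(mu, sd) for c, mu in means.items()}
    return tuple(sorted(utilities, key=utilities.get, reverse=True))

if __name__ == "__main__":
    theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}  # hypothetical parameters
    print([sample_ranking(theta) for _ in range(5)])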

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = [λ1 / (λ1+…+λm)] × [λ2 / (λ2+…+λm)] × … × [λm-1 / (λm-1+λm)]
– the first factor: c1 is the top choice in {c1,…,cm}; the second: c2 is the top choice in {c2,…,cm}; …; the last: cm-1 is preferred to cm in {cm-1, cm}

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
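A sketch (not from the slides) of the closed-form P-L probability of a single ranking, and of a profile's likelihood; the λ values are hypothetical.

from math import prod

def pl_prob(ranking, lam):
    """Plackett-Luce probability of one ranking: at each position, the
    chosen alternative's weight divided by the total weight of the
    alternatives not yet ranked."""
    prob, remaining = 1.0, list(ranking)
    for c in ranking[:-1]:                    # the last factor equals 1
        prob *= lam[c] / sum(lam[x] for x in remaining)
        remaining.remove(c)
    return prob

def pl_likelihood(profile, lam):
    """Likelihood of an i.i.d. profile under P-L."""
    return prod(pl_prob(r, lam) for r in profile)

if __name__ == "__main__":
    lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}   # hypothetical parameters
    print(pl_prob(("c1", "c2", "c3"), lam))   # 3/6 * 2/3 = 0.333...
    print(pl_likelihood([("c1", "c2", "c3"), ("c2", "c1", "c3")], lam))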

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} ∫_{Um}^{∞} … ∫_{U2}^{∞} μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
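With no closed form available, a common workaround is Monte Carlo: the sketch below (not from the slides) estimates Pr(c1 ≻ c2 ≻ c3 | Θ) by sampling utilities, assuming unit-variance normals with hypothetical means.

import random

def mc_prob_of_ranking(ranking, means, sd=1.0, samples=50_000):
    """Monte Carlo estimate of the probability that a normal RUM with the
    given means (and a common standard deviation) generates this ranking."""
    hits = 0
    for _ in range(samples):
        u = {c: random.gauss(mu, sd) for c, mu in means.items()}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

if __name__ == "__main__":
    theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}  # hypothetical parameters
    print(mc_prob_of_ranking(("c1", "c2", "c3"), theta))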

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99
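Log-concavity is what makes a generic first-order method sufficient here: any stationary point is a global maximum. The sketch below (not from the slides, and not the algorithm of the paper) fits P-L, a member of the location family, by plain gradient ascent on the log-likelihood in log-parameter space; the step size and profile are hypothetical.

import math

def pl_ll_and_grad(profile, theta):
    """Exact log-likelihood and gradient for Plackett-Luce, parameterized
    by theta[c] = log(lambda_c); the log-likelihood is concave in theta."""
    ll = 0.0
    grad = {c: 0.0 for c in theta}
    for ranking in profile:
        remaining = list(ranking)
        for c in ranking[:-1]:
            z = sum(math.exp(theta[x]) for x in remaining)
            ll += theta[c] - math.log(z)
            grad[c] += 1.0
            for x in remaining:
                grad[x] -= math.exp(theta[x]) / z
            remaining.remove(c)
    return ll, grad

def fit_pl(profile, steps=500, lr=0.1):
    """Plain gradient ascent; by log-concavity, any local optimum is global."""
    theta = {c: 0.0 for c in profile[0]}
    for _ in range(steps):
        _, grad = pl_ll_and_grad(profile, theta)
        for c in theta:
            theta[c] += lr * grad[c]
    # report lambdas normalized to sum to 1 (they are identified up to scale)
    lam = {c: math.exp(t) for c, t in theta.items()}
    total = sum(lam.values())
    return {c: lam[c] / total for c in lam}

if __name__ == "__main__":
    D = [("c1", "c2", "c3"), ("c1", "c3", "c2"), ("c2", "c1", "c3")]
    print(fit_pl(D))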

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log-likelihood (ELL)
  ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100
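A toy sketch of the E-step/M-step structure (not the paper's implementation): it assumes unit-variance normal utilities, approximates the E-step expectations by Gibbs sampling with a simple rejection sampler for the truncated normals, and re-centers the means in the M-step.

import random

def gibbs_expected_utilities(ranking, theta, sweeps=200, burn=50):
    """Approximate E[U | ranking, theta] for one vote by Gibbs sampling:
    conditioned on the others, each U_c is N(theta_c, 1) truncated to lie
    between the utilities of its neighbors in the ranking."""
    m = len(ranking)
    u = {c: float(m - pos) for pos, c in enumerate(ranking)}  # order-consistent start
    sums = {c: 0.0 for c in ranking}
    for sweep in range(sweeps):
        for pos, c in enumerate(ranking):
            upper = u[ranking[pos - 1]] if pos > 0 else float("inf")
            lower = u[ranking[pos + 1]] if pos + 1 < m else float("-inf")
            while True:  # rejection sampler for the truncated normal
                x = random.gauss(theta[c], 1.0)
                if lower < x < upper:
                    u[c] = x
                    break
        if sweep >= burn:
            for c in ranking:
                sums[c] += u[c]
    return {c: sums[c] / (sweeps - burn) for c in ranking}

def mc_em(profile, iterations=20):
    """MC-EM for a normal RUM with unit variances: E-step approximated by
    Gibbs sampling, M-step averages the expected utilities per alternative
    and re-centers them (means are identified only up to a common shift)."""
    alts = list(profile[0])
    theta = {c: 0.0 for c in alts}
    for _ in range(iterations):
        totals = {c: 0.0 for c in alts}
        for ranking in profile:                               # E-step
            e_u = gibbs_expected_utilities(ranking, theta)
            for c in alts:
                totals[c] += e_u[c]
        theta = {c: totals[c] / len(profile) for c in alts}   # M-step
        shift = sum(theta.values()) / len(alts)
        theta = {c: theta[c] - shift for c in alts}
    return theta

if __name__ == "__main__":
    D = [("c1", "c2", "c3")] * 3 + [("c2", "c1", "c3")] * 2 + [("c1", "c3", "c2")]
    print(mc_em(D))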

Outline: statistical approaches
• Condorcet's MLE model (history)
• Why MLE?
• Why Condorcet's model?
• A general framework
• Random Utility Models
• Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):   LL 44.8 (15.8);   Pred. LL 87.4 (30.5);   AIC -79.6 (31.6);   BIC -50.5 (31.6)
Red: statistically significant with 95% confidence
102
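For reference, a small sketch (not from the slides) of how AIC and BIC are computed from a fitted model's maximized log-likelihood; the log-likelihood values and parameter counts below are hypothetical placeholders.

import math

def aic(log_likelihood, num_params):
    """Akaike information criterion (lower is better)."""
    return 2 * num_params - 2 * log_likelihood

def bic(log_likelihood, num_params, num_samples):
    """Bayesian information criterion (lower is better)."""
    return num_params * math.log(num_samples) - 2 * log_likelihood

if __name__ == "__main__":
    n_votes = 50                  # hypothetical: 50 voters, 9 alternatives
    ll_pl, k_pl = -420.0, 8       # hypothetical fitted log-likelihood; m-1 free P-L weights
    ll_nrm, k_nrm = -375.0, 8     # hypothetical fitted log-likelihood for the normal RUM
    print("P-L:    ", aic(ll_pl, k_pl), bic(ll_pl, k_pl, n_votes))
    print("Normal: ", aic(ll_nrm, k_nrm), bic(ll_nrm, k_nrm, n_votes))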

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms
CS ↔ Social Choice
Strategic thinking + methods/principles of aggregation

Thank you!


Slide 50

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 51

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[Framework recap: Data D → inference → information about the ground truth → decision making → Decision]

• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
[Figure: the utility distributions μ1, μ2, μ3 (parameterized by θ1, θ2, θ3) with sampled utilities U1, U2, U3]
95

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏R∈Data Pr(R | θ1, θ2, θ3)
[Figure: the parameters θ1, θ2, θ3 generate each agent's vote independently, e.g. Agent 1: P1 = c2≻c1≻c3, …, Agent n: Pn = c1≻c2≻c3]
96
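A minimal sampling sketch of this generative process, assuming normal utility distributions with unit variance as one concrete choice of the μi; the means and candidate names are illustrative.

```python
# Sampling a preference profile from a random utility model: each agent draws an
# independent utility for every alternative from that alternative's distribution
# (here normal with mean theta_i and unit variance) and ranks by utility.
import random

def sample_vote(thetas):
    utils = {c: random.gauss(mu, 1.0) for c, mu in thetas.items()}
    return tuple(sorted(utils, key=utils.get, reverse=True))

def sample_profile(thetas, n):
    return [sample_vote(thetas) for _ in range(n)]

if __name__ == "__main__":
    random.seed(0)
    thetas = {"c1": 1.0, "c2": 0.5, "c3": 0.0}   # hypothetical ground-truth means
    print(sample_profile(thetas, n=5))
```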

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1≻c2≻…≻cm | λ1,…,λm) = λ1/(λ1+λ2+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the i-th factor is the probability that ci is the top choice among {ci,…,cm}; in particular, the last factor says cm-1 is preferred to cm

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
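The closed-form product above translates directly into code; a small sketch with illustrative weights λ (the function names are mine, not from any library):

```python
# Plackett-Luce probability of a ranking: at each position, the probability that
# the ranked alternative is chosen among the alternatives not yet placed.
def pl_probability(ranking, lam):
    """ranking: tuple of alternatives, best first; lam: dict of positive weights."""
    prob = 1.0
    remaining = list(ranking)
    for c in ranking[:-1]:                      # last remaining alternative has prob. 1
        prob *= lam[c] / sum(lam[d] for d in remaining)
        remaining.remove(c)
    return prob

def pl_likelihood(profile, lam):
    out = 1.0
    for V in profile:
        out *= pl_probability(V, lam)
    return out

if __name__ == "__main__":
    lam = {"c1": 2.0, "c2": 1.0, "c3": 1.0}     # hypothetical weights
    print(pl_probability(("c1", "c2", "c3"), lam))   # 2/4 * 1/2 = 0.25
```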

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1≻…≻cm | Θ) = ∫_{Um=-∞}^{∞} ∫_{Um-1=Um}^{∞} … ∫_{U1=U2}^{∞} μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
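Although no closed form is known, Pr(c1≻…≻cm | Θ) can be estimated by straightforward Monte Carlo sampling of the utilities; a sketch with illustrative means and fixed unit variance:

```python
# Monte Carlo estimate of the probability of a ranking under a normal RUM:
# sample utilities, count how often they induce the target ranking.
import random

def mc_ranking_prob(ranking, thetas, sigma=1.0, samples=50_000):
    hits = 0
    for _ in range(samples):
        utils = {c: random.gauss(thetas[c], sigma) for c in thetas}
        order = tuple(sorted(utils, key=utils.get, reverse=True))
        hits += (order == ranking)
    return hits / samples

if __name__ == "__main__":
    random.seed(0)
    thetas = {"c1": 1.0, "c2": 0.5, "c3": 0.0}   # hypothetical means, fixed variance
    print(mc_ranking_prob(("c1", "c2", "c3"), thetas))
```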

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99
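One practical consequence of log-concavity: even plain gradient ascent on the log-likelihood finds the global maximum. The sketch below does this for P-L with a numerical gradient, fixing the mean of the parameters for identifiability; it is a simple alternative shown for illustration, not the MC-EM algorithm of the next slide, and the profile and step size are illustrative.

```python
# Direct maximization of the (log-concave) Plackett-Luce log-likelihood by
# plain gradient ascent with a numerical gradient; lambda_i = exp(theta_i).
import math

def pl_log_likelihood(profile, theta):
    lam = {c: math.exp(t) for c, t in theta.items()}
    ll = 0.0
    for V in profile:
        remaining = list(V)
        for c in V[:-1]:
            ll += math.log(lam[c]) - math.log(sum(lam[d] for d in remaining))
            remaining.remove(c)
    return ll

def fit_pl(profile, candidates, steps=2000, lr=0.05, eps=1e-5):
    theta = {c: 0.0 for c in candidates}
    for _ in range(steps):
        grad = {}
        for c in candidates:                      # central-difference gradient
            theta[c] += eps
            up = pl_log_likelihood(profile, theta)
            theta[c] -= 2 * eps
            down = pl_log_likelihood(profile, theta)
            theta[c] += eps
            grad[c] = (up - down) / (2 * eps)
        for c in candidates:
            theta[c] += lr * grad[c]
        mean = sum(theta.values()) / len(theta)   # fix the scale (identifiability)
        for c in candidates:
            theta[c] -= mean
    return theta

if __name__ == "__main__":
    profile = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2 + [("c", "a", "b")]
    print(fit_pl(profile, ["a", "b", "c"]))
```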

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t:
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) − Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) − Value(PL):
LL: 44.8 (15.8)   Pred. LL: 87.4 (30.5)   AIC: -79.6 (31.6)   BIC: -50.5 (31.6)

Red: statistically significant with 95% confidence
102
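For reference, AIC and BIC are computed from the maximized log-likelihood L, the number of free parameters k, and the number of observations n (standard definitions, not specific to this dataset); a small sketch with placeholder numbers:

```python
# AIC = 2k - 2L, BIC = k*ln(n) - 2L; lower is better for AIC/BIC,
# higher is better for (predictive) log-likelihood.
import math

def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    return k * math.log(n) - 2 * log_likelihood

if __name__ == "__main__":
    # Hypothetical numbers, only to show how the criteria are formed.
    print(aic(-1000.0, k=9), bic(-1000.0, k=9, n=50))
```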

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Diagram: CS ↔ Social Choice — computational thinking + optimization algorithms in one direction, strategic thinking + methods/principles of aggregation in the other]

Thank you!


Slide 52

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 53

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[Framework diagram: Data D → inference → information about the ground truth → decision making → Decision]

• Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
[Figure: the three utility distributions μ1, μ2, μ3, with parameters θ1, θ2, θ3 and sampled utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
[Figure] Parameters θ1, θ2, θ3 generate each agent's ranking independently: Agent 1: P1 = c2≻c1≻c3, …, Agent n: Pn = c1≻c2≻c3.
96
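As a hedged illustration of this generative process (all parameter values below are made up), the sketch draws rankings from a RUM with unit-variance normal utility distributions and estimates Pr(c2≻c1≻c3 | θ) by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([1.0, 1.5, 0.2])          # assumed means of mu_1, mu_2, mu_3

def sample_ranking(theta, rng):
    u = rng.normal(loc=theta, scale=1.0)   # perceived utilities U_1, ..., U_m
    return tuple(np.argsort(-u))           # alternatives sorted from best to worst

samples = [sample_ranking(theta, rng) for _ in range(100_000)]
target = (1, 0, 2)                         # the ranking c2 > c1 > c3, 0-indexed
print(sum(s == target for s in samples) / len(samples))   # ~ Pr(c2 > c1 > c3 | theta)
```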

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1≻c2≻…≻cm | λ1, …, λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor is the probability that c1 is the top choice in {c1,…,cm}; the last factor, that cm-1 is preferred to cm in {cm-1, cm}

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
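A minimal sketch of the product formula above (the weights are assumed toy values): at each step the next-ranked alternative is chosen from the remaining ones with probability proportional to its λ.

```python
def plackett_luce_prob(ranking, lam):
    # ranking: alternatives from most to least preferred; lam: dict alternative -> lambda > 0
    prob, remaining = 1.0, list(ranking)
    for c in ranking[:-1]:
        prob *= lam[c] / sum(lam[d] for d in remaining)   # c is the top choice among the remaining
        remaining.remove(c)
    return prob

lam = {'a': 3.0, 'b': 2.0, 'c': 1.0}
print(plackett_luce_prob(('a', 'b', 'c'), lam))   # (3/6) * (2/3) = 1/3
```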

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1≻…≻cm | Θ) = ∫_{-∞}^{∞} ∫_{Um}^{∞} … ∫_{U2}^{∞} μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm
(Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞)

98
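For intuition on why the integral is hard only for full rankings, here is a small sketch of the one case that does collapse: with two alternatives and unit-variance normals, U1 - U2 ~ N(θ1 - θ2, 2), so Pr(c1≻c2 | Θ) = Φ((θ1 - θ2)/√2). The means below are assumed toy values; for m ≥ 3 the analogous quantity is a multivariate orthant probability with no comparable elementary formula.

```python
from math import erf, sqrt

def std_normal_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

theta1, theta2 = 1.0, 0.2
print(std_normal_cdf((theta1 - theta2) / sqrt(2.0)))   # Pr(c1 > c2 | theta) ~= 0.71
```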

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99
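As a hedged illustration of why log-concavity matters in practice, the sketch below fits the P-L member of the location family (θ = log λ) by handing the concave log-likelihood to a generic optimizer; by the theorem above, any local optimum it finds is global. The toy profile and the use of scipy here are my own choices for the sketch, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

profile = [(0, 1, 2), (0, 1, 2), (1, 2, 0), (0, 2, 1)]   # toy rankings, best to worst
m = 3

def neg_log_likelihood(theta):               # P-L log-likelihood is concave in theta = log(lambda)
    ll = 0.0
    for ranking in profile:
        remaining = list(ranking)
        for c in ranking[:-1]:
            ll += theta[c] - logsumexp([theta[d] for d in remaining])
            remaining.remove(c)
    return -ll

res = minimize(neg_log_likelihood, x0=np.zeros(m))       # any local optimum is global here
print(np.exp(res.x - res.x.max()))                       # fitted lambdas, up to a common scale
```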

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log-likelihood (ELL)
  ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), where the expectation is approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100
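The following is a heavily hedged sketch of the MC-EM idea for the special case of unit-variance normal utilities (toy data, fixed iteration counts instead of the ε stopping rule, and a naive Gibbs sampler; none of these choices come from the paper): the E-step samples latent utilities consistent with each observed ranking, and the M-step, in this Gaussian case, reduces to taking means.

```python
import numpy as np
from scipy.stats import truncnorm

def gibbs_utilities(ranking, theta, sweeps=10):
    # one agent's latent utilities, constrained to U[ranking[0]] > ... > U[ranking[-1]]
    m = len(theta)
    u = np.empty(m)
    u[list(ranking)] = np.arange(m, 0, -1, dtype=float)    # any ordering-consistent start
    for _ in range(sweeps):
        for j, alt in enumerate(ranking):
            upper = u[ranking[j - 1]] if j > 0 else np.inf
            lower = u[ranking[j + 1]] if j < m - 1 else -np.inf
            a, b = lower - theta[alt], upper - theta[alt]   # bounds in standard units (scale = 1)
            u[alt] = truncnorm.rvs(a, b, loc=theta[alt], scale=1.0)
    return u

def mc_em(profile, iterations=15, samples_per_agent=5):
    m = len(profile[0])
    theta = np.zeros(m)
    for _ in range(iterations):
        # E-step (Monte Carlo): approximate E[U | ranking, theta] by Gibbs samples
        expected_u = [np.mean([gibbs_utilities(r, theta) for _ in range(samples_per_agent)], axis=0)
                      for r in profile]
        # M-step: with unit-variance normals, the expected log-likelihood is maximized at the means
        theta = np.mean(expected_u, axis=0)
        theta -= theta.mean()               # utilities are shift-invariant; fix the location
    return theta

profile = [(1, 0, 2), (1, 0, 2), (0, 1, 2), (1, 2, 0)]   # toy rankings, best to worst
print(mc_em(profile))                                    # alternative 1 should get the largest theta
```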

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters
Value(Normal) - Value(PL):
LL: 44.8 (15.8)   Pred. LL: 87.4 (30.5)   AIC: -79.6 (31.6)   BIC: -50.5 (31.6)
(Higher is better for LL and predictive LL; lower is better for AIC and BIC, so all four differences favor the normal RUM.)

Red: statistically significant with 95% confidence
102
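For reference, the two information criteria in the table are simple functions of the maximized log-likelihood; a small sketch with placeholder numbers (k = number of free parameters, n = number of rankings):

```python
from math import log

def aic(ll, k):
    return 2 * k - 2 * ll          # lower is better

def bic(ll, k, n):
    return k * log(n) - 2 * ll     # lower is better

print(aic(-120.0, 9), bic(-120.0, 9, 50))   # placeholder values, just to show usage
```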

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Closing diagram: CS and Social Choice, linked by "computational thinking + optimization algorithms" and "strategic thinking + methods/principles of aggregation"]

Thank you!


Slide 54

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 55

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities
– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)

(figure: the utility distributions μ1, μ2, μ3, parameterized by θ1, θ2, θ3, with one sampled draw U1, U2, U3)
95

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏_{R∈Data} Pr(R | θ1, θ2, θ3)

(figure: the parameters θ1, θ2, θ3 generate each agent’s vote independently, e.g. Agent 1: P1 = c2≻c1≻c3, …, Agent n: Pn = c1≻c2≻c3)
96
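As a concrete illustration, the sketch below samples a profile from a RUM with normal utility distributions; the means, the standard deviation, and the profile size are placeholder values, not parameters from the slides.

```python
import random

def sample_profile(means, n_agents, sd=1.0, seed=0):
    """Sample a preference profile from a RUM with normal utility distributions.
    means[c] is the mean utility of alternative c; each agent draws one utility
    per alternative and ranks alternatives by decreasing perceived utility."""
    rng = random.Random(seed)
    profile = []
    for _ in range(n_agents):
        utilities = {c: rng.gauss(mu, sd) for c, mu in means.items()}
        ranking = tuple(sorted(means, key=lambda c: utilities[c], reverse=True))
        profile.append(ranking)
    return profile

profile = sample_profile({"c1": 1.0, "c2": 0.5, "c3": 0.0}, n_agents=5)
for vote in profile:
    print(" ≻ ".join(vote))
```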

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor: c1 is the top choice in {c1,…,cm}
– the second factor: c2 is the top choice in {c2,…,cm}
– the last factor: cm-1 is preferred to cm

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable
• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
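A direct transcription of this formula, with placeholder λ values:

```python
def pl_probability(ranking, lam):
    """Plackett-Luce probability of observing `ranking` (most to least preferred),
    given positive weights lam[c] for each alternative c."""
    prob = 1.0
    remaining = list(ranking)
    while len(remaining) > 1:
        top = remaining[0]
        prob *= lam[top] / sum(lam[c] for c in remaining)
        remaining = remaining[1:]
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}         # placeholder weights
print(pl_probability(("c1", "c2", "c3"), lam))   # 3/6 * 2/3 = 1/3
```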

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P | Θ) is known

Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} ∫_{Um}^{∞} ⋯ ∫_{U2}^{∞} μm(Um) μm-1(Um-1) ⋯ μ1(U1) dU1 ⋯ dUm-1 dUm
(Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞)

98
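Since no closed form is known, the probability can be estimated by simulation; a minimal Monte Carlo sketch is below, with placeholder means and a fixed unit variance.

```python
import random

def mc_ranking_probability(ranking, means, sd=1.0, samples=100_000, seed=0):
    """Monte Carlo estimate of Pr(ranking) under a RUM with normal utilities:
    draw utilities independently and count how often they sort into `ranking`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = {c: rng.gauss(means[c], sd) for c in means}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

means = {"c1": 1.0, "c2": 0.5, "c3": 0.0}   # placeholder Θ
print(mc_ranking_probability(("c1", "c2", "c3"), means))
```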

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maximizers is convex

99
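As an illustration of "local optimality = global optimality": the P-L log-likelihood is concave in the log-weights, so a generic optimizer recovers the global MLE. The sketch below assumes SciPy and NumPy are available and uses a placeholder three-vote profile; the last log-weight is pinned to 0 since the weights are only identified up to scaling.

```python
import math
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(theta, profile, alternatives):
    """Negative P-L log-likelihood; theta are log-weights (last one fixed to 0)."""
    logw = dict(zip(alternatives, list(theta) + [0.0]))
    total = 0.0
    for ranking in profile:
        remaining = list(ranking)
        while len(remaining) > 1:
            top = remaining[0]
            total += logw[top] - math.log(sum(math.exp(logw[c]) for c in remaining))
            remaining = remaining[1:]
    return -total

alternatives = ["c1", "c2", "c3"]
profile = [("c1", "c2", "c3"), ("c1", "c3", "c2"), ("c2", "c1", "c3")]
res = minimize(neg_log_likelihood, x0=np.zeros(len(alternatives) - 1),
               args=(profile, alternatives), method="L-BFGS-B")
mle_weights = {c: math.exp(w) for c, w in zip(alternatives, list(res.x) + [0.0])}
print(mle_weights)  # concavity => this local optimum is the global MLE
```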

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log-likelihood
ELL(Θ | Data, Θt) = f (Θ, g(Data, Θt))
(approximated by Gibbs sampling)
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) − Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with normal distributions and P-L for
– log-likelihood (LL)
– predictive log-likelihood (Pred. LL)
– Akaike information criterion (AIC)
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters

                            LL           Pred. LL     AIC           BIC
Value(Normal) - Value(PL)   44.8(15.8)   87.4(30.5)   -79.6(31.6)   -50.5(31.6)

Red: statistically significant with 95% confidence
102
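For reference, the two penalized criteria used above are simple functions of the maximized log-likelihood LL, the number of free parameters k, and the number of votes n; the numbers in the sketch below are placeholders, not values from the slide’s dataset.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2*LL (lower is better)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k*ln(n) - 2*LL (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood

ll, k, n = -1234.5, 18, 50   # placeholder values
print(aic(ll, k), bic(ll, k, n))
```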

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

(diagram) CS ↔ Social Choice
– Computational thinking + optimization algorithms
– Strategic thinking + methods/principles of aggregation

Thank you!


Slide 56

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 57

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that

Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+λ2+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)

– the i-th factor is the probability that ci is the top choice among {ci,…,cm}; the last factor is the probability that cm-1 is preferred to cm

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
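The product formula translates directly into code. Here is a minimal sketch with hypothetical strength parameters λ (any positive numbers work); the log-likelihood of a profile is just the sum of the log of this probability over its votes, which is what makes the model analytically tractable.

def plackett_luce_prob(ranking, lam):
    # Pr(ranking | lambda): at each step, the next item is chosen with probability
    # proportional to its strength among the items not yet placed.
    prob = 1.0
    for i, c in enumerate(ranking[:-1]):
        prob *= lam[c] / sum(lam[d] for d in ranking[i:])
    return prob

lam = {'c1': 3.0, 'c2': 2.0, 'c3': 1.0}              # hypothetical strengths
print(plackett_luce_prob(('c1', 'c2', 'c3'), lam))   # 3/6 * 2/3 = 1/3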

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫ μm(Um) μm-1(Um-1) … μ1(U1) dU1 … dUm-1 dUm,
integrated over Um from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t:
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
(the expectation is approximated by Gibbs sampling)
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Repeat until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):

            LL             Pred. LL       AIC             BIC
            44.8 (15.8)    87.4 (30.5)    -79.6 (31.6)    -50.5 (31.6)

Red: statistically significant with 95% confidence
102
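For reference, AIC and BIC are simple functions of the maximized log-likelihood LL, the number of free parameters k, and the number of observations n (lower is better). The sketch below uses hypothetical numbers only to show the computation; they are not the values in the table above.

import math

def aic(log_likelihood, k):
    # Akaike information criterion: 2k - 2*LL.
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    # Bayesian information criterion: k*ln(n) - 2*LL.
    return k * math.log(n) - 2 * log_likelihood

ll_normal, ll_pl = -512.3, -530.1    # hypothetical fitted log-likelihoods
k, n = 9, 50                         # e.g. one parameter per alternative, 50 votes
print(aic(ll_normal, k) - aic(ll_pl, k))
print(bic(ll_normal, k, n) - bic(ll_pl, k, n))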

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– cf. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Diagram: CS ↔ Social Choice — computational thinking + optimization algorithms in one direction, strategic thinking + methods/principles of aggregation in the other]

Thank you!


Slide 58

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 59

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maximizers is convex

99
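
In practice the theorem means that a generic local optimizer already finds a global MLE. A minimal Python sketch (my own; it uses Plackett-Luce, whose negative log-likelihood is convex in the log-weights, a tiny invented profile, and scipy's default quasi-Newton method; the weights are only identified up to a common scale, so the first log-weight is pinned to 0):

import numpy as np
from scipy.optimize import minimize

profile = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (0, 1, 2)]   # rankings over {c0, c1, c2}
m = 3

def neg_log_likelihood(free_params):
    log_lam = np.concatenate(([0.0], free_params))   # fix log lambda_0 = 0
    lam = np.exp(log_lam)
    nll = 0.0
    for ranking in profile:
        remaining = list(ranking)
        for c in ranking[:-1]:
            nll -= np.log(lam[c] / sum(lam[x] for x in remaining))
            remaining.remove(c)
    return nll

res = minimize(neg_log_likelihood, x0=np.zeros(m - 1))   # local search suffices: the objective is convex
print(np.exp(np.concatenate(([0.0], res.x))))            # MLE weights, with lambda_0 normalized to 1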

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log-likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
(approximately computed by Gibbs sampling)
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100
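
A simplified rendering of the MC-EM idea in Python for a normal RUM with unit variance (my own sketch, not the published implementation; the profile is invented and scipy's truncnorm supplies the Gibbs conditionals). The E-step Gibbs-samples latent utilities consistent with each reported ranking given the current means; with unit variance the M-step reduces to averaging the sampled utilities.

import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(2)
profile = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (0, 1, 2), (0, 1, 2)]   # rankings, best first
m, n_gibbs, n_iters = 3, 100, 15

def gibbs_mean_utilities(ranking, theta):
    """Average Gibbs samples of utilities constrained to follow `ranking`."""
    u = np.sort(np.array([theta[c] for c in ranking], dtype=float))[::-1]   # feasible start
    total = np.zeros(m)
    for _ in range(n_gibbs):
        for k, c in enumerate(ranking):
            hi = np.inf if k == 0 else u[k - 1]               # stay below the alternative ranked above
            lo = -np.inf if k == len(ranking) - 1 else u[k + 1]
            a, b = lo - theta[c], hi - theta[c]               # truncnorm takes standardized bounds
            u[k] = truncnorm.rvs(a, b, loc=theta[c], scale=1.0, random_state=rng)
        total[list(ranking)] += u                             # accumulate by alternative index
    return total / n_gibbs

theta = np.zeros(m)
for _ in range(n_iters):
    # E-step (Monte Carlo): expected latent utilities given each ranking and the current theta
    expected_u = np.mean([gibbs_mean_utilities(r, theta) for r in profile], axis=0)
    # M-step: with unit variance the ELL is maximized by these averages; center them,
    # since the means are only identified up to a common shift
    theta = expected_u - expected_u.mean()
print(theta)   # should roughly rank c0 above c1 above c2 for this profile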

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters
                            LL            Pred. LL      AIC            BIC
Value(Normal) - Value(PL)   44.8 (15.8)   87.4 (30.5)   -79.6 (31.6)   -50.5 (31.6)

Red: statistically significant with 95% confidence
102
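
For reference, AIC and BIC are computed from each fitted model's maximized log-likelihood LL and number of free parameters k as AIC = 2k - 2·LL and BIC = k·ln(n) - 2·LL (lower is better for AIC/BIC, higher for LL). A minimal Python sketch with placeholder numbers (my own; the parameter counts and log-likelihoods below are invented, not the values behind the table):

import math

def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    return k * math.log(n) - 2 * log_likelihood

ll_normal, k_normal = -850.0, 18     # placeholder fit of the normal RUM
ll_pl, k_pl = -895.0, 9              # placeholder fit of Plackett-Luce
n = 50                               # number of observed rankings (voters)

print(aic(ll_normal, k_normal), aic(ll_pl, k_pl))
print(bic(ll_normal, k_normal, n), bic(ll_pl, k_pl, n))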

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Diagram: CS and Social Choice, linked by "computational thinking + optimization algorithms" and "strategic thinking + methods/principles of aggregation"]

Thank you!


Slide 60

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 61

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters

Value(Normal) - Value(PL):
LL: 44.8 (15.8)    Pred. LL: 87.4 (30.5)    AIC: -79.6 (31.6)    BIC: -50.5 (31.6)

(Values shown in red on the original slide are statistically significant with 95% confidence)
102
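For reference, the two information criteria in the table are simple functions of a fitted model's maximized log-likelihood. A small sketch using the standard definitions (the numeric arguments are made up for illustration):

```python
import numpy as np

def aic(log_likelihood, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return n_params * np.log(n_obs) - 2 * log_likelihood

# Illustrative call: a Plackett-Luce fit over 9 alternatives has 8 free parameters
# (the weights are identified only up to a common scale); n_obs = 50 rankings.
print(aic(log_likelihood=-120.3, n_params=8))
print(bic(log_likelihood=-120.3, n_params=8, n_obs=50))
```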

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Diagram: CS ↔ Social Choice; computational thinking + optimization algorithms flow one way, strategic thinking + methods/principles of aggregation the other]

Thank you!


Slide 62

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 63

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

[Figure: CS and Social Choice: computational thinking + optimization algorithms; strategic thinking + methods/principles of aggregation]

Thank you!


Slide 64

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 65

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 71

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 72

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
Example (26 votes): 10 votes a > b > c > d, 7 votes d > a > b > c, 6 votes c > d > a > b, 3 votes b > c > d > a. Round 1 keeps a (10 first-place votes) and d (7); in the runoff, d is preferred to a by 16 of the 26 voters, so d wins.
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
Example (same 26 votes): 10 votes a > b > c > d, 7 votes d > a > b > c, 6 votes c > d > a > b, 3 votes b > c > d > a. Round 1 eliminates b (3 first-place votes), transferring 3 votes to c; round 2 eliminates d (7, vs. 10 for a and 9 for c); round 3 eliminates c, so a wins.
15
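A short sketch of STV winner determination for the example above (helper names are mine; tie-breaking among lowest-scoring alternatives is left open by the rule and broken alphabetically here).

def stv_winner(profile):
    # profile: list of rankings (lists of alternatives, best first)
    remaining = {a for ranking in profile for a in ranking}
    while len(remaining) > 1:
        scores = {a: 0 for a in remaining}
        for ranking in profile:
            top = next(a for a in ranking if a in remaining)
            scores[top] += 1
        # drop the alternative with the lowest plurality score among the remaining ones
        loser = min(remaining, key=lambda a: (scores[a], a))
        remaining.remove(loser)
    return remaining.pop()

votes = (10 * [["a", "b", "c", "d"]] + 7 * [["d", "a", "b", "c"]]
         + 6 * [["c", "d", "a", "b"]] + 3 * [["b", "c", "d", "a"]])
print(stv_winner(votes))  # a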

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 2 (the pairs {a,b} and {a,c} are ordered differently; {b,c} agrees)
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16
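A brute-force sketch of the Kemeny rule (only feasible for a handful of alternatives, consistent with the hardness results discussed later; names are mine).

from itertools import combinations, permutations

def kendall_tau(v, w):
    # number of pairs of alternatives ordered differently by rankings v and w
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2)
               if (pos_v[a] - pos_v[b]) * (pos_w[a] - pos_w[b]) < 0)

def kemeny(profile):
    alternatives = sorted(profile[0])
    # enumerate all m! rankings and pick the one closest to the profile
    return min(permutations(alternatives),
               key=lambda w: sum(kendall_tau(v, w) for v in profile))

print(kendall_tau(["b", "c", "a"], ["a", "b", "c"]))  # 2
votes = [["a", "b", "c"], ["a", "b", "c"], ["b", "c", "a"]]
print(kemeny(votes))  # ('a', 'b', 'c'); its top-ranked alternative a is the winner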

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
                          Condorcet consistency   Consistency   Easy to compute
Positional scoring rules            N                   Y               Y
Kemeny                              Y                   N               N
Ranked pairs                        Y                   N               Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that select a single winner, anonymity is not compatible with neutrality
– proof sketch: W.L.O.G. consider two alternatives {a, b} and two voters, where Alice votes a > b and Bob votes b > a. Anonymity makes the winner invariant under swapping the two votes, and neutrality makes it equivariant under renaming a ↔ b; since doing both leaves the profile unchanged, a wins if and only if b wins — impossible for a rule that selects a single winner.
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3, and consider 7 votes over three alternatives {a, b, c}:
  3 voters: a ≻ b ≻ c
  2 voters: b ≻ c ≻ a
  1 voter:  b ≻ a ≻ c
  1 voter:  c ≻ a ≻ b
– a is the Condorcet winner (it beats both b and c 4–3 in pairwise elections)
– but a’s total score is 3s1 + 2s2 + 2s3, which is less than b’s total score 3s1 + 3s2 + 1s3 (since s2 > s3), so a does not win under the scoring rule
22
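A quick numerical check of the counterexample above (a sketch with my own helper names; any score vector with s1 > s2 > s3, e.g. Borda’s (2,1,0), exhibits the failure).

profile = (3 * [["a", "b", "c"]] + 2 * [["b", "c", "a"]]
           + 1 * [["b", "a", "c"]] + 1 * [["c", "a", "b"]])

def pairwise_wins(profile, x, y):
    # number of voters who rank x above y
    return sum(1 for r in profile if r.index(x) < r.index(y))

# a is the Condorcet winner: it beats both b and c 4-3
print(pairwise_wins(profile, "a", "b"), pairwise_wins(profile, "a", "c"))

def total_score(profile, s, x):
    return sum(s[r.index(x)] for r in profile)

s = (2, 1, 0)
print({x: total_score(profile, s, x) for x in ["a", "b", "c"]})  # b outscores a (9 vs 8)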

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: a voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
– If you believe A1, A2, A3 are the most desirable properties, then X is optimal
– (anonymity + neutrality + consistency + continuity) ⇔ positional scoring rules [Young SIAMAM-75]
– (neutrality + consistency + Condorcet consistency) ⇔ Kemeny [Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
1. Classical Social Choice
2.1 Computational aspects, Part 1
2.2 Computational aspects, Part 2
3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: n·m·log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues = { Main course, Wine }
– Alternatives = { main course options } × { wine options } (shown as pictures on the slide)
30

Multiple referenda
• In California, voters voted on 11 binary (yes/no) issues
– 2^11 = 2048 combinations in total
– 5/11 are about budget and taxes
• Prop. 30: increase sales and some income tax for education
• Prop. 38: increase income tax on almost everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]: knowledge bases K1, K2, …, Kn → merging operator → merged base

• Judgment aggregation [List and Pettit EP-02]
            Action P   Action Q   Liable? (P∧Q)
  Judge 1      Y          Y            Y
  Judge 2      Y          N            N
  Judge 3      N          Y            N
  Majority     Y          Y            N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(example with three voters — Alice, Bob, and Carol — each ranking three pictured alternatives; ties are broken in favor of a fixed alternative)

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: a voter submits 0 or 1 for each alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes.
• Can we use computational complexity as a barrier? Yes.
• Is it a strong barrier? No.
• How often is manipulation a problem? Seems not very often.
• Other barriers? Limited information, limited communication.
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the use of hardness in cryptography

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
Rule                    One manipulator         At least two manipulators
Copeland                P [BTT SCW-89b]         NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]         NPC [BO SCW-91]
Veto                    P [ZPR AIJ-09]          P [ZPR AIJ-09]
Plurality with runoff   P [ZPR AIJ-09]          P [ZPR AIJ-09]
Cup                     P [CSL JACM-07]         P [CSL JACM-07]
Borda                   P [BTT SCW-89b]         NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                 P [BTT SCW-89b]         NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]     NPC [XZP+ IJCAI-09]
Bucklin                 P [XZP+ IJCAI-09]       P [XZP+ IJCAI-09]
Nanson’s rule           NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
Baldwin’s rule          NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– with o(√n) manipulators the coalition has essentially no power, and with ω(√n) manipulators it is all-powerful; the transition happens at Θ(√n) manipulators
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
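A sketch of the greedy idea for Borda UCO (each manipulator ranks c first and buries the currently strongest competitors); the implementation details and tie-breaking below are my own and are not guaranteed to match the exact algorithm analyzed in [Zuckerman, Procaccia, & Rosenschein AIJ-09].

def greedy_borda_uco(nonmanip_profile, alternatives, c, max_manipulators=1000):
    m = len(alternatives)
    # Borda scores from the non-manipulators
    scores = {a: 0 for a in alternatives}
    for r in nonmanip_profile:
        for pos, a in enumerate(r):
            scores[a] += m - 1 - pos
    manipulator_votes = []
    for _ in range(max_manipulators):
        if all(scores[c] >= scores[a] for a in alternatives if a != c):
            return manipulator_votes  # ties assumed broken in favor of c
        # c first; currently weak alternatives get the high positions, strong ones the low ones
        others = sorted((a for a in alternatives if a != c), key=lambda a: scores[a])
        vote = [c] + others
        for pos, a in enumerate(vote):
            scores[a] += m - 1 - pos
        manipulator_votes.append(vote)
    return None  # gave up within the cap

votes = [["b", "a", "c"], ["b", "c", "a"]]
print(greedy_borda_uco(votes, ["a", "b", "c"], "a"))  # two manipulator votes suffice here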

An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
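For intuition, the optimal makespan of Q|pmtn|Cmax has a simple closed form (the classical bound attained by preemptive level scheduling); a small sketch, with a function name of my own:

def qpmtn_cmax(jobs, speeds):
    # jobs: processing requirements; speeds: machine speeds
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)[:len(jobs)]  # extra machines beyond the jobs are useless
    best = sum(jobs) / sum(speeds)  # all work spread over all machines
    for k in range(1, len(speeds)):
        # the k largest jobs cannot finish faster than on the k fastest machines
        best = max(best, sum(jobs[:k]) / sum(speeds[:k]))
    return best

print(qpmtn_cmax([8, 5, 3], [3, 2, 1]))  # 8/3 ≈ 2.67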

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the non-manipulators’ profile PNM
• Every manipulator ranks c first (e.g. V1 = [c > c1 > c2 > c3]), so c gains s1 points per manipulator vote
• Alternative ci becomes job Ji whose remaining work is ci’s point deficit relative to c (initially pi − p); placing ci in the (i+1)-th position of a manipulator vote shrinks that deficit by s1 − si+1, so machine i runs at speed s1 − si+1 (e.g. after V1, J1’s remaining work is p1 − p − (s1 − s2))
• Making c the winner corresponds to finishing all jobs, i.e., to a preemptive schedule on these machines
55

The approximation algorithm
• Convert the original UCO instance into a Q|pmtn|Cmax scheduling problem
• Solve the scheduling problem optimally [Gonzalez&Sahni JACM-78]
• Round the preemptive schedule back to integral manipulator votes
• The resulting number of manipulators is no more than OPT + m − 2
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer, Lang, &Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
1. Classical Social Choice
2.1 Computational aspects, Part 1
2.2 Computational aspects, Part 2
3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(pictures A, B, C)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p it agrees with the ground truth
– w/p 1−p it differs from the ground truth

• Then, as n→∞, the majority of agents’ preferences converges in probability to the ground truth
75
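A small simulation of the theorem (the parameters below are my own choices, not from the tutorial):

import random

def majority_correct_rate(n_agents, p, trials=2000, seed=0):
    # fraction of trials in which a majority of i.i.d. agents matches the ground truth
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        correct = sum(1 for _ in range(n_agents) if rng.random() < p)
        hits += correct > n_agents / 2
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_rate(n, p=0.55))  # approaches 1 as n grows, since p > 0.5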

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V such that each opinion is i.i.d.: if c≻d in W, then c≻d in V w/p p and d≻c in V w/p 1−p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
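A brute-force sketch showing why the MLE ranking under Mallows is the Kemeny ranking: since ϕ < 1 and the normalization does not depend on W, maximizing ∏ ϕ^Kendall(V,W) over W is exactly minimizing the total Kendall tau distance. Names are mine; this only runs for a handful of alternatives.

import math
from itertools import combinations, permutations

def kendall_tau(v, w):
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2)
               if (pos_v[a] - pos_v[b]) * (pos_w[a] - pos_w[b]) < 0)

def mallows_mle(profile, phi=0.5):
    alternatives = sorted(profile[0])
    # log-likelihood of W, up to an additive constant: sum_V Kendall(V, W) * log(phi)
    def loglik(w):
        return sum(kendall_tau(v, w) for v in profile) * math.log(phi)
    return max(permutations(alternatives), key=loglik)

votes = [["a", "b", "c"], ["a", "b", "c"], ["b", "c", "a"]]
print(mallows_mle(votes))  # ('a', 'b', 'c') -- the same ranking the Kemeny rule returns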

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a model Mr with ground truth Θ generating the data D = (P1, …, Pn):
• Step 1 (statistical inference): extract information about the ground truth from D
• Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): find the most probable ranking
• Step 2: output the top-1 alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p; you observe 10 heads, 4 tails (credit: Panos Ipeirotis & Roy Radner)
– Do you think the next two tosses will be two heads in a row?
• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 ≈ 0.714
– Pr(2 heads | p = 0.714) = 0.714^2 ≈ 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– Pr(2 heads | Data) ≈ 0.485 < 0.5
– No!

83
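A quick check of the two numbers above (a sketch; with a uniform prior the posterior over p is Beta(11, 5), so the Bayesian predictive probability of two heads is E[p^2] under that Beta distribution):

heads, tails = 10, 4

# Likelihood reasoning: plug in the point estimate of p
p_hat = heads / (heads + tails)
print(p_hat ** 2)  # ≈ 0.510

# Bayesian reasoning: uniform prior => posterior Beta(heads + 1, tails + 1)
a, b = heads + 1, tails + 1
print(a * (a + 1) / ((a + b) * (a + b + 1)))  # E[p^2] ≈ 0.485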

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1 (MLE): find the most probable ranking
• Step 2: output its top-1 alternative as the winner
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood of all parameters (opinions)
• Step 2: choose the top alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet’s model
• Step 1 (Bayesian update): compute the posterior over rankings
• Step 2: choose the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(framework recap: Data D → inference → information about the ground truth → decision making → Decision)
• Model selection: how can we evaluate fitness? Likelihood or Bayesian? — focus on MLE
• Computation: how can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
(figure: the utility distributions parameterized by θ1, θ2, θ3 and sampled utilities U1, U2, U3)

95

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏_{R∈Data} Pr(R | θ1, θ2, θ3)
(figure: the parameters θ1, θ2, θ3 generate Agent 1’s ranking P1 = c2≻c1≻c3, …, Agent n’s ranking Pn = c1≻c2≻c3)
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
  Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
  (c1 is the top choice in {c1,…,cm}, c2 is the top choice in {c2,…,cm}, …, cm-1 is preferred to cm)
• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
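A minimal sketch of the P-L probability of a ranking, and of sampling a ranking from the model (function names are mine):

import random

def pl_probability(ranking, weights):
    # Plackett-Luce: probability of observing `ranking` given positive weights λ
    prob = 1.0
    remaining = list(ranking)
    for c in ranking[:-1]:
        prob *= weights[c] / sum(weights[a] for a in remaining)
        remaining.remove(c)
    return prob

def pl_sample(weights, rng=random):
    # sequentially pick the top choice among the remaining alternatives
    remaining = list(weights)
    ranking = []
    while remaining:
        total = sum(weights[a] for a in remaining)
        r, acc = rng.random() * total, 0.0
        for a in remaining:
            acc += weights[a]
            if r <= acc:
                ranking.append(a)
                remaining.remove(a)
                break
    return ranking

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}
print(pl_probability(["c1", "c2", "c3"], lam))  # 3/6 * 2/3 = 1/3
print(pl_sample(lam, random.Random(0)))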

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) … ∫_{U2}^{∞} μ1(U1) dU1 … dUm-1 dUm
(Um ranges from −∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞)

98
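Although the likelihood has no known closed form, it is straightforward to estimate by simulation; a sketch with parameters of my own choosing (fixed unit variance, as in Thurstone’s Case V):

import random

def rum_normal_ranking_prob(means, ranking, sd=1.0, samples=100_000, seed=0):
    # Monte Carlo estimate of Pr(ranking) when alternative a has utility ~ N(means[a], sd^2)
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = {a: rng.gauss(means[a], sd) for a in means}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}
print(rum_normal_ranking_prob(theta, ["c1", "c2", "c3"]))  # the most likely ranking here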

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.
• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), where the expectation is approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) − Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                              LL            Pred. LL       AIC            BIC
Value(Normal) − Value(PL)     44.8 (15.8)   87.4 (30.5)    −79.6 (31.6)   −50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 74

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19
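Condorcet consistency is easy to test once you can find the Condorcet winner. A small sketch (my own illustration; the example profile is arbitrary):

```python
def condorcet_winner(profile):
    """Return the alternative that beats every other alternative in pairwise
    majority elections, or None if no such alternative exists."""
    alts = set(profile[0])
    for c in alts:
        beats_all = True
        for d in alts - {c}:
            pref_c = sum(1 for v in profile if v.index(c) < v.index(d))
            if not (pref_c > len(profile) - pref_c):   # needs a strict majority
                beats_all = False
                break
        if beats_all:
            return c
    return None

# a is ranked above b and above c by a majority, so a is the Condorcet winner
print(condorcet_winner([("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]))  # -> a
```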

Which axiom is more important?
                           Condorcet consistency   Consistency   Easy to compute
Positional scoring rules             N                   Y               Y
Kemeny                               Y                   N               N
Ranked pairs                         Y                   N               Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that select a single winner, anonymity is not compatible with neutrality
– Proof idea (W.L.O.G.): take two alternatives and two voters, Alice and Bob, with opposite rankings. Exchanging the two votes leaves the (anonymous) profile unchanged, and renaming the two alternatives turns each vote into the other one, so it also leaves the profile unchanged; by neutrality, however, the renaming must swap the winner. A single winner cannot equal its own swap, so no rule satisfies both anonymity and neutrality on this profile.
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is Condorcet consistent:
– suppose s1 > s2 > s3
– consider 7 votes over {a, b, c}: 3 voters a ≻ b ≻ c, 2 voters b ≻ c ≻ a, 1 voter b ≻ a ≻ c, 1 voter c ≻ a ≻ b
– a is the Condorcet winner: it beats both b and c 4–3 in pairwise elections
– but a's total score is 3s1 + 2s2 + 2s3, while b's is 3s1 + 3s2 + 1s3; since s2 > s3, b scores strictly more, so the Condorcet winner a does not win

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: a voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
– If you believe A1, A2, A3 are the most desirable properties, then X is optimal
– (anonymity + neutrality + consistency + continuity) ⇔ positional scoring rules [Young SIAMAM-75]
– (neutrality + consistency + Condorcet consistency) ⇔ Kemeny [Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
1. Classical Social Choice (45 min)
2.1 Computational aspects, Part 1
2.2 Computational aspects, Part 2
3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nm log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives = {possible main courses} × {possible wines}
30

Multiple referenda
• In California, voters voted on 11 binary issues (yes/no)
– 2^11 = 2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
– knowledge bases K1, K2, …, Kn are combined by a merging operator

• Judgment aggregation [List and Pettit EP-02]
            Action P   Action Q   Liable? (P∧Q)
Judge 1        Y          Y            Y
Judge 2        Y          N            N
Judge 3        N          Y            N
Majority       Y          Y            N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
• (Pictorial example with three voters, Alice, Bob, and Carol, and ties broken in favor of a fixed alternative: by misreporting her ranking, one voter changes the plurality winner to an alternative she prefers)

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences
– Approval voting: a voter submits 0 or 1 for each alternative

41

Computational thinking
• Use a voting rule that is so complicated that nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
• Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No: how often is manipulation actually hard? Seems not very often
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a manipulator to compute a manipulation, she is best off voting truthfully
– similar to the use of hardness in cryptography

For which common voting rules is manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
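For intuition (and for tiny instances only), UCM can be decided by brute force: try every possible joint vote of the manipulators. A hedged sketch, using the plurality rule with alphabetical tie-breaking as a placeholder for the rule r (both are my own assumptions for the illustration):

```python
from itertools import permutations, product
from collections import Counter

def plurality_winner(profile):
    counts = Counter(v[0] for v in profile)
    top = max(counts.values())
    return sorted(a for a in counts if counts[a] == top)[0]   # ties broken alphabetically

def ucm_bruteforce(rule, profile_nm, n_manipulators, c):
    """Is there a joint manipulator vote making c the winner under `rule`?
    Exponential in the number of manipulators and alternatives -- illustration only."""
    alts = list(profile_nm[0])
    for manip_votes in product(permutations(alts), repeat=n_manipulators):
        if rule(list(profile_nm) + list(manip_votes)) == c:
            return True
    return False

profile_nm = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a")]
print(ucm_bruteforce(plurality_winner, profile_nm, 2, "b"))   # True: two b-first votes elect b
```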

The stunningly big table for UCM
Rule                    One manipulator         At least two manipulators
Copeland                P [BTT SCW-89b]         NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]         NPC [BO SCW-91]
Veto                    P [ZPR AIJ-09]          P [ZPR AIJ-09]
Plurality with runoff   P [ZPR AIJ-09]          P [ZPR AIJ-09]
Cup                     P [CSL JACM-07]         P [CSL JACM-07]
Borda                   P [BTT SCW-89b]         NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                 P [BTT SCW-89b]         NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]     NPC [XZP+ IJCAI-09]
Bucklin                 P [XZP+ IJCAI-09]       P [XZP+ IJCAI-09]
Nanson's rule           NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
Baldwin's rule          NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule (including many common voting rules), Θ(√n) manipulators is the threshold: with asymptotically fewer manipulators the coalition has no power (w.h.p. it cannot change the winner), and with asymptotically more it is all-powerful (w.h.p. it can make any alternative win)
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
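A sketch in the spirit of that greedy algorithm for Borda (my own illustration, not the authors' code; tie-breaking in favor of c and the example profile are assumptions): each manipulator ranks c first and then ranks the remaining alternatives in increasing order of their current Borda score, so the strongest rival gets the fewest points.

```python
def greedy_borda_manipulation(profile_nm, c, max_manipulators=1000):
    """Greedy heuristic in the spirit of Zuckerman, Procaccia & Rosenschein AIJ-09.
    Ties are assumed to be broken in favor of c."""
    alts = list(profile_nm[0])
    m = len(alts)
    scores = {a: 0 for a in alts}
    for v in profile_nm:
        for pos, a in enumerate(v):
            scores[a] += m - 1 - pos
    manip_votes = []
    while any(scores[a] > scores[c] for a in alts if a != c):
        if len(manip_votes) >= max_manipulators:
            return None                        # give up (illustration only)
        others = sorted((a for a in alts if a != c), key=lambda a: scores[a])
        vote = [c] + others                    # weakest rival right after c, strongest last
        for pos, a in enumerate(vote):
            scores[a] += m - 1 - pos
        manip_votes.append(tuple(vote))
    return manip_votes

# Non-manipulators' Borda scores: a=6, b=7, c=4, d=1; one greedy vote already makes c win
profile_nm = [("a", "b", "c", "d"), ("a", "b", "c", "d"), ("b", "c", "d", "a")]
print(greedy_borda_manipulation(profile_nm, "c"))   # [('c', 'd', 'a', 'b')]
```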

An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the non-manipulators' profile
• Every manipulator ranks c first; after adding a vote V1 = [c > c1 > c2 > c3], alternative ci's score deficit against c becomes pi − p − (s1 − si+1)
• This gives the scheduling view: alternative ci is a job Ji with workload pi − p, processed on machines with speeds s1 − s2, s1 − s3, …, s1 − sm
55

The approximation algorithm
Original UCO instance → reduced to a scheduling problem → solved exactly [Gonzalez&Sahni JACM 78] → the scheduling solution is rounded back to a UCO solution with cost no more than OPT + m − 2
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information about the non-manipulators' votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
1. Classical Social Choice (45 min)
2.1 Computational aspects, Part 1
2.2 Computational aspects, Part 2
3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(Pictures of A, B, C omitted)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability 0.5 < p < 1
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
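A quick simulation of the jury theorem (my own illustration; p = 0.6 and the trial counts are arbitrary choices) showing the majority matching the ground truth more and more often as n grows:

```python
import random

def majority_correct_rate(n_agents, p, trials=2000, seed=0):
    """Fraction of trials in which the majority of n_agents i.i.d. agents
    (each agreeing with the ground truth w/p p) matches the ground truth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        agree = sum(rng.random() < p for _ in range(n_agents))
        hits += agree > n_agents / 2
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_rate(n, p=0.6))   # rate climbs toward 1 as n grows
```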

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
– given c ≻ d in W: with probability p, c ≻ d in V; with probability 1 − p, d ≻ c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
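A small sketch (my own, with an arbitrary ϕ) that evaluates the unnormalized Mallows likelihood of every candidate ground-truth ranking. Since Pr(V|W) ∝ ϕ^K(V,W) and the normalization does not depend on W, the most likely W is exactly the ranking minimizing the total Kendall tau distance, i.e., the Kemeny ranking.

```python
from itertools import combinations, permutations

def kendall_tau(v, w):
    pv = {a: i for i, a in enumerate(v)}
    pw = {a: i for i, a in enumerate(w)}
    return sum((pv[a] < pv[b]) != (pw[a] < pw[b]) for a, b in combinations(v, 2))

def mallows_mle(profile, phi=0.5):
    """Unnormalized Mallows likelihood of each candidate ground truth W:
    L(W) = prod over votes V of phi**K(V, W).  Its argmax is the Kemeny ranking."""
    alts = profile[0]
    likelihood = {w: phi ** sum(kendall_tau(v, w) for v in profile)
                  for w in permutations(alts)}
    return max(likelihood, key=likelihood.get), likelihood

profile = [("c1", "c2", "c3"), ("c2", "c1", "c3"), ("c3", "c1", "c2")]
print(mallows_mle(profile)[0])    # ('c1', 'c2', 'c3') -- the Kemeny ranking
```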

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Step 1 (statistical inference): given the model Mr, extract information about the ground truth Θ from the data D = (P1, P2, …, Pn)
Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
• Decision space: a unique winner
• Mr = Mallows model
• Step 1: MLE → the most probable ranking
• Step 2: output the top-1 alternative of that ranking
• Input: data D = (P1, P2, …, Pn)
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 ≈ 0.714
– Pr(2 heads | p = 0.714) = (0.714)^2 ≈ 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data), assuming a uniform prior
– compute Pr(2 heads | Data) ≈ 0.485 < 0.5
– No!

83
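The two answers on this slide can be checked in a few lines (my own sketch; with a uniform prior the posterior is Beta(11, 5), and the Bayesian predictive probability of two heads is E[p²] under that posterior):

```python
# Likelihood (plug-in) reasoning: point estimate p = 10/14
p_hat = 10 / 14
print(round(p_hat, 3), round(p_hat ** 2, 3))             # 0.714 and 0.51 > 0.5 -> "Yes"

# Bayesian reasoning: uniform prior => posterior Beta(11, 5),
# and Pr(two heads | data) = E[p^2] = a*(a+1) / ((a+b)*(a+b+1))
a, b = 10 + 1, 4 + 1
print(round(a * (a + 1) / ((a + b) * (a + b + 1)), 3))   # 0.485 < 0.5 -> "No"
```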

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1: MLE over the data D = (P1, …, Pn) → the most probable ranking
• Step 2: output its top-1 alternative as the winner
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood of all parameters (opinions) given the data D = (P1, …, Pn)
• Step 2: choose the top alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1: Bayesian update on the data D = (P1, …, Pn) → posterior over rankings
• Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
[Comparison table: the Likelihood (Mallows) and Bayesian (Condorcet) approaches are compared on anonymity/neutrality/monotonicity, consistency, Condorcet consistency, and ease of computation]

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
• Model selection: how can we evaluate fitness?
• Likelihood or Bayesian? (here: focus on MLE)
• Computation: how can we compute the MLE efficiently?
(Recap of the framework: data D → inference → information about the ground truth → decision making → decision)
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2 ≻ c1 ≻ c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
(figure of the three utility distributions μ1, μ2, μ3 omitted)
95
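That probability can be approximated directly from the definition by sampling perceived utilities. A sketch with independent normal utility distributions and illustrative means (my own choice of parameters):

```python
import random

def rum_ranking_prob(means, ranking, sd=1.0, samples=100_000, seed=0):
    """Monte Carlo estimate of Pr(ranking | Theta) under a RUM with
    independent Normal(mean_i, sd) perceived utilities."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = {c: rng.gauss(means[c], sd) for c in means}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

means = {"c1": 0.5, "c2": 1.0, "c3": 0.0}           # illustrative theta values
print(rum_ranking_prob(means, ("c2", "c1", "c3")))  # ranking by decreasing mean: the most probable one
```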

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
• Parameters: θ1, θ2, θ3
• Agent 1: P1 = c2 ≻ c1 ≻ c3
• …
• Agent n: Pn = c1 ≻ c2 ≻ c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the i-th factor is the probability that ci is the top choice in {ci,…,cm}
• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
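The product formula above translates directly into code. A small sketch computing the P-L probability of a ranking (my own; the λ values are illustrative):

```python
def plackett_luce_prob(ranking, lam):
    """Pr(ranking | lambda) = prod over positions i of lam[c_i] / (lam[c_i] + ... + lam[c_m])."""
    prob = 1.0
    for i, c in enumerate(ranking[:-1]):          # the last factor equals 1
        prob *= lam[c] / sum(lam[d] for d in ranking[i:])
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}
print(plackett_luce_prob(("c1", "c2", "c3"), lam))   # 3/6 * 2/3 = 1/3
```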

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) … ∫_{U2}^{∞} μ1(U1) dU1 … dUm-1 dUm
(Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞)

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99
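Because the log-likelihood is concave for these models, even plain gradient ascent finds a global maximum. A tiny sketch for the P-L special case (my own illustration; the profile, step size, and normalization of the λ's are assumptions), working with θ = log λ:

```python
import math

def pl_mle(profile, steps=2000, lr=0.05):
    """Gradient ascent on theta = log(lambda) for the Plackett-Luce model.
    The log-likelihood is concave in theta, so any local maximum found this
    way is global (lambda is identified only up to scale; we normalize it)."""
    alts = sorted({a for r in profile for a in r})
    theta = {a: 0.0 for a in alts}
    for _ in range(steps):
        lam = {a: math.exp(theta[a]) for a in alts}
        grad = {a: 0.0 for a in alts}
        for ranking in profile:
            for i, c in enumerate(ranking[:-1]):
                denom = sum(lam[d] for d in ranking[i:])
                grad[c] += 1.0                      # chosen alternative at this stage
                for d in ranking[i:]:
                    grad[d] -= lam[d] / denom       # soft-max pull on every remaining alternative
        for a in alts:
            theta[a] += lr * grad[a]
    lam = {a: math.exp(theta[a]) for a in alts}
    total = sum(lam.values())
    return {a: lam[a] / total for a in alts}

profile = [("c1", "c2", "c3")] * 4 + [("c2", "c1", "c3")] * 2 + [("c3", "c1", "c2")]
print(pl_mle(profile))   # c1 gets the largest estimated lambda
```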

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), where g(Data, Θt) is approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) − Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) − Value(PL):
  LL: 44.8 (15.8)   Pred. LL: 87.4 (30.5)   AIC: -79.6 (31.6)   BIC: -50.5 (31.6)
(values shown in red on the original slide are statistically significant with 95% confidence)
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 75

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 76

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
– Positional scoring rules: Condorcet consistency N, consistency Y, easy to compute Y
– Kemeny: Condorcet consistency Y, consistency N, easy to compute N
– Ranked pairs: Condorcet consistency Y, consistency N, easy to compute Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that select a single winner, anonymity is not compatible with neutrality
– proof sketch (w.l.o.g. two voters and two alternatives): suppose Alice votes a > b and Bob votes b > a
– by anonymity, swapping Alice's and Bob's votes cannot change the winner; by neutrality, swapping the names of a and b must change the winner; but both swaps produce the same profile, a contradiction
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is Condorcet consistent:
– suppose s1 > s2 > s3
– the slide shows a 7-voter profile (2 voters + 1 voter + 1 voter + 3 voters) over three alternatives, pictured as images, in which one alternative is the Condorcet winner
– the Condorcet winner's total score is 3s1 + 2s2 + 2s3, while another alternative's total score is 3s1 + 3s2 + 1s3, which is strictly larger because s2 > s3
– so the Condorcet winner does not win under any such scoring rule
22
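For concreteness, a 7-voter profile with the same position counts can be checked mechanically (the labels a, b, c are hypothetical; the slide shows pictures instead): a is the Condorcet winner, yet any score vector with s2 > s3 gives b a strictly higher score.

# Hypothetical 7-voter profile with the position counts from the slide
# (the labels a, b, c are mine; the slide uses pictures).
profile = 3 * [["a", "b", "c"]] + 2 * [["b", "c", "a"]] + [["b", "a", "c"], ["c", "a", "b"]]

def beats(x, y):
    # does x beat y in their pairwise election?
    return sum(v.index(x) < v.index(y) for v in profile) > len(profile) / 2

def score(x, s):
    return sum(s[v.index(x)] for v in profile)

print(all(beats("a", y) for y in ["b", "c"]))        # True: a is the Condorcet winner
print(score("a", (2, 1, 0)), score("b", (2, 1, 0)))  # 8 9: Borda (and any s1>s2>s3) prefers b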

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: a voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
– If you believe A1, A2, A3 are the most desirable properties, then X is optimal
– (anonymity + neutrality + consistency + continuity) ⟺ positional scoring rules [Young SIAMAM-75]
– (neutrality + consistency + Condorcet consistency) ⟺ Kemeny [Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule's degree of satisfaction of these axiomatic properties?
– Tradeoffs between the satisfaction of different axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
1. Classical Social Choice (45 min)
2.1 Computational aspects, Part 1 (55 min)
2.2 Computational aspects, Part 2 (30 min)
3. Statistical approaches (75 min)

25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: n·m·log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues = { Main course, Wine }
– Alternatives = { main course options } × { wine options } (shown as pictures on the slide)
30

Multiple referenda
• In California, voters voted on 11 binary issues (yes/no)
– 2^11 = 2048 combinations in total
– 5 of the 11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
– knowledge bases K1, K2, …, Kn are combined by a merging operator

• Judgment aggregation [List and Pettit EP-02]
            Action P   Action Q   Liable? (P∧Q)
Judge 1:       Y          Y            Y
Judge 2:       Y          N            N
Judge 3:       N          Y            N
Majority:      Y          Y            N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a vote that does not represent her true preferences, to make herself better off
• A voting rule is strategy-proof if there is never a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of a fixed alternative; the slide shows Alice's, Bob's, and Carol's rankings as pictures, together with the resulting plurality outcome)

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences
– Approval voting: a voter submits 0 or 1 for each alternative

41

Computational thinking
• Use a voting rule that is so complicated that nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Republic of Venice for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes. How often? Seems not very often.
• Can we use computational complexity as a barrier? Yes. Is it a strong barrier? No.
• Other barriers? Limited information, limited communication
43

Manipulation: A computational complexity perspective
If it is computationally too hard for a manipulator to compute a manipulation, she is best off voting truthfully
– similar to the use of hardness in cryptography
For which common voting rules is manipulation computationally hard?
44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
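For tiny instances, UCM can be decided by brute force over all possible manipulator votes; the sketch below (my own illustration; the plurality rule and its alphabetical tie-breaking are assumptions, not part of the definition) makes the decision problem concrete.

from itertools import permutations, product

def ucm_brute_force(rule, P_NM, num_manipulators, c, alternatives):
    # rule: any function mapping a profile (list of rankings) to a single winner.
    # Exhaustive search over all manipulator votes; exponential, so tiny instances only.
    for P_M in product(permutations(alternatives), repeat=num_manipulators):
        if rule(P_NM + [list(v) for v in P_M]) == c:
            return True
    return False

def plurality(profile):
    tally = {}
    for ranking in profile:
        tally[ranking[0]] = tally.get(ranking[0], 0) + 1
    return max(sorted(tally), key=tally.get)  # ties broken alphabetically (an assumption)

P_NM = [["a", "c", "b"], ["a", "c", "b"], ["b", "a", "c"]]
print(ucm_brute_force(plurality, P_NM, 2, "b", ["a", "b", "c"]))  # True
print(ucm_brute_force(plurality, P_NM, 2, "c", ["a", "b", "c"]))  # False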

The stunningly big table for UCM
#manipulators:          one manipulator         at least two
Copeland:               P [BTT SCW-89b]         NPC [FHS AAMAS-08,10]
STV:                    NPC [BO SCW-91]         NPC [BO SCW-91]
Veto:                   P [ZPR AIJ-09]          P [ZPR AIJ-09]
Plurality with runoff:  P [ZPR AIJ-09]          P [ZPR AIJ-09]
Cup:                    P [CSL JACM-07]         P [CSL JACM-07]
Borda:                  P [BTT SCW-89b]         NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin:                P [BTT SCW-89b]         NPC [XZP+ IJCAI-09]
Ranked pairs:           NPC [XZP+ IJCAI-09]     NPC [XZP+ IJCAI-09]
Bucklin:                P [XZP+ IJCAI-09]       P [XZP+ IJCAI-09]
Nanson's rule:          NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
Baldwin's rule:         NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result
[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– including many common voting rules
– as the number of manipulators grows, there is a transition at Θ(√n) between “no power” and “all-powerful”
• Computational complexity is not a strong barrier against manipulation
– UCM as a decision problem is easy to compute in most cases
– The case of Θ(√n) has been studied experimentally in [Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
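If I recall the classical result for Q|pmtn|Cmax correctly (level-algorithm analyses in the style of Horvath, Lam & Sethi and Gonzalez & Sahni), the optimal preemptive makespan has a closed form: with jobs and speeds sorted in decreasing order, it is the maximum of the total-work bound and the prefix bounds. A minimal Python sketch of that formula, stated as an assumption rather than as part of the slide:

# Minimal sketch of the (assumed) closed form for the optimal preemptive makespan
# on uniform machines: max of the total-work bound and the k-prefix bounds.
def q_pmtn_cmax(jobs, speeds):
    jobs = sorted(jobs, reverse=True)      # processing requirements, largest first
    speeds = sorted(speeds, reverse=True)  # machine speeds, fastest first
    bounds = [sum(jobs) / sum(speeds)]     # all the work spread over all machines
    for k in range(1, len(speeds)):
        if k <= len(jobs):
            bounds.append(sum(jobs[:k]) / sum(speeds[:k]))  # k largest jobs on k fastest machines
    return max(bounds)

print(q_pmtn_cmax([12, 5, 4], [3, 2, 1]))  # 4.0: the largest job alone needs 12/3 time units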

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the profile PNM ∪ {V1 = [c > c1 > c2 > c3]}
• The figure maps this to a scheduling instance: each alternative ci becomes a job Ji whose amount is ci's remaining score deficit against c (roughly pi - p, adjusted by the points V1 already grants), and the machines correspond to the positions below the top in a manipulator's vote, with speeds s1 - s2, s1 - s3, s1 - s4, …
55

The approximation algorithm
Original UCO → scheduling problem → solve the scheduling problem [Gonzalez&Sahni JACM-78] → rounding → solution to the UCO
The resulting solution uses no more than OPT + m-2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a restricted setting?
– The manipulators have complete information about the non-manipulators' votes
– The manipulators can fully coordinate their strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
• The leader and followers have the same preferences
– Safe manipulation [Slinko&White COMSOC-08]: a vote W such that
• no matter how many followers there are, the leader and potential followers are not worse off
• sometimes they are better off
– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}
– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]
• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
1. Classical Social Choice (45 min)
2.1 Computational aspects, Part 1 (55 min)
2.2 Computational aspects, Part 2 (30 min)
3. Statistical approaches (75 min)

71

Ranking pictures [PGM+ AAAI-12]
(three pictures A, B, C are ranked by crowd workers)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents' preferences converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t. each opinion is i.i.d.:
– if c≻d in W, then c≻d in V with probability p, and d≻c in V with probability 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V with probability
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
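Why the MLE under Mallows' model is exactly Kemeny is a one-line calculation, since the normalization constant does not depend on the ground truth W; a sketch in LaTeX:

L(W \mid D) = \prod_{P \in D} \frac{\varphi^{K(P,W)}}{Z(\varphi)}
            = \frac{1}{Z(\varphi)^n}\,\varphi^{\sum_{P \in D} K(P,W)},
\qquad
\arg\max_W L(W \mid D) = \arg\min_W \sum_{P \in D} K(P,W) = \mathrm{Kemeny}(D),

since 0 < \varphi < 1 and Z(\varphi) = \sum_V \varphi^{K(V,W)} is the same for every W.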

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a model Mr with ground truth Θ that generates the data D = (P1, P2, …, Pn):
Step 1 (statistical inference): from D, extract information about the ground truth
Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; decision space: a unique winner
Step 1 (MLE): find the most probable ranking
Step 2: output its top-ranked alternative as the winner
(data D = P1, P2, …, Pn)
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)
• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p|Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!
83
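The two numbers on the slide can be reproduced: with a uniform prior the posterior over p is Beta(11, 5), and the Bayesian predictive probability of two more heads is its second moment (a standard computation, sketched in LaTeX):

\Pr(\text{two heads} \mid \text{Data}) = \mathbb{E}_{p \sim \mathrm{Beta}(11,5)}\!\left[p^2\right]
  = \frac{11 \cdot 12}{16 \cdot 17} \approx 0.485 < 0.5,
\qquad\text{whereas}\qquad
\hat{p}^{\,2} = (10/14)^2 \approx 0.51 > 0.5 .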

Kemeny = Likelihood approach
Mr = Mallows model
Step 1 (MLE): find the most probable ranking
Step 2: output its top-1 alternative as the winner
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
Step 1: compute the likelihood of all parameters (opinions), and from it the most probable ranking
Step 2: choose the top alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet's model
Step 1: Bayesian update, giving a posterior over rankings
Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
– Likelihood (Mallows): anonymity/neutrality/monotonicity Y, consistency N, Condorcet Y, easy to compute N
– Bayesian (Condorcet): anonymity/neutrality/monotonicity Y, consistency Y, Condorcet N, easy to compute Y
Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: statistical decision theory
87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical decision framework
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute the MLE efficiently?
(framework: data D → inference → information about the ground truth → decision making → decision)
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
(the slide illustrates the three utility distributions, parameterized by θ1, θ2, θ3, and one sampled utility profile U1, U2, U3)

95

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏R∈Data Pr(R | θ1, θ2, θ3)
(illustration: from the parameters θ1, θ2, θ3, Agent 1 draws P1 = c2≻c1≻c3, …, Agent n draws Pn = c1≻c2≻c3)
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor is the probability that c1 is the top choice in {c1,…,cm}, the second that c2 is the top choice in {c2,…,cm}, and so on
• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
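The product formula translates directly into code; a minimal Python sketch (parameter values are arbitrary, chosen only for illustration) that evaluates the P-L probability of a ranking and the log-likelihood of a profile:

import math

def pl_prob(ranking, lam):
    # lam: dict mapping each alternative to its positive parameter lambda_i
    prob, remaining = 1.0, list(ranking)
    while len(remaining) > 1:
        top = remaining.pop(0)
        # probability that 'top' is the top choice among the alternatives still remaining
        prob *= lam[top] / (lam[top] + sum(lam[x] for x in remaining))
    return prob

def pl_log_likelihood(profile, lam):
    return sum(math.log(pl_prob(r, lam)) for r in profile)

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}
print(pl_prob(["c1", "c2", "c3"], lam))  # (3/6) * (2/3) = 1/3
print(pl_log_likelihood([["c1", "c2", "c3"], ["c2", "c1", "c3"]], lam))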

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
(Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞)

98
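Even without an analytical form, ranking probabilities under a normal RUM are easy to estimate by simulation; a minimal Monte Carlo sketch in Python (the parameter values are arbitrary and only for illustration):

import random

def mc_ranking_prob(theta, sigma, ranking, samples=50_000):
    # estimate Pr(ranking) when each U_i ~ Normal(theta[c_i], sigma), drawn independently
    hits = 0
    for _ in range(samples):
        u = {c: random.gauss(theta[c], sigma) for c in theta}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}
print(mc_ranking_prob(theta, 1.0, ["c1", "c2", "c3"]))
# well above the uniform 1/6, since the ranking agrees with the order of the means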

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.
• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)), where g(Data, Θt) is approximately computed by Gibbs sampling
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):
LL 44.8 (15.8); Pred. LL 87.4 (30.5); AIC -79.6 (31.6); BIC -50.5 (31.6)
Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 77

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 78

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
  – Template: a voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
  – If you believe A1, A2, A3 are the most desirable properties, then X is optimal
  – (anonymity + neutrality + consistency + continuity) ⇔ positional scoring rules [Young SIAMAM-75]
  – (neutrality + consistency + Condorcet consistency) ⇔ Kemeny [Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives = {main course options} × {wine options} (pictured on the slide)
30
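For concreteness, such a two-issue domain can be enumerated directly; the menu items below are placeholders, not the slide's pictured options:

from itertools import product

issues = {'main course': ['fish', 'beef', 'vegetarian'],
          'wine': ['red', 'white']}
alternatives = list(product(*issues.values()))
print(len(alternatives))   # 6 = 3 x 2; with p binary issues there are 2**p alternatives
print(alternatives[:2])    # [('fish', 'red'), ('fish', 'white')]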

Multiple referenda
• In California, voters voted on 11 binary issues (yes/no)
  – 2^11 = 2048 combinations in total
  – 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
– The belief bases K1, K2, …, Kn are combined by a merging operator (figure omitted)

• Judgment aggregation [List and Pettit EP-02]
            Action P   Action Q   Liable? (P∧Q)
  Judge 1      Y          Y            Y
  Judge 2      Y          N            N
  Judge 3      N          Y            N
  Majority     Y          Y            N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of a fixed, pictured alternative)
• Alice's, Bob's, and Carol's rankings over three pictured candidates are shown on the slide (not recoverable here), together with how the plurality winner changes when one voter misreports
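Since the candidates on the slide are pictures, here is a generic instance of the same phenomenon (an illustrative reconstruction with hypothetical alternatives a, b, c and ties broken in favor of a): Carol truthfully prefers c ≻ b ≻ a, but misreporting b as her top choice changes the winner from a to b, which she prefers.

def plurality(profile, tiebreak='abc'):
    # winner = alternative with most first-place votes; ties broken by position in tiebreak
    firsts = [v[0] for v in profile]
    return max(set(firsts), key=lambda x: (firsts.count(x), -tiebreak.index(x)))

truthful = [('a','b','c'), ('b','a','c'), ('c','b','a')]   # Alice, Bob, Carol
print(plurality(truthful))                                  # 'a' (three-way tie, broken in favor of a)
manipulated = truthful[:2] + [('b','c','a')]                # Carol misreports
print(plurality(manipulated))                               # 'b', which Carol truly prefers to 'a'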

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences
– Approval voting: a voter submits 0 or 1 for each alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Republic of Venice
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
• Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes. How often? Seems not very often
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the situation in cryptography
NPHard

For which common voting rules is
manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
                          One manipulator           At least two manipulators
  Copeland                P [BTT SCW-89b]           NPC [FHS AAMAS-08,10]
  STV                     NPC [BO SCW-91]           NPC [BO SCW-91]
  Veto                    P [ZPR AIJ-09]            P [ZPR AIJ-09]
  Plurality with runoff   P [ZPR AIJ-09]            P [ZPR AIJ-09]
  Cup                     P [CSL JACM-07]           P [CSL JACM-07]
  Borda                   P [BTT SCW-89b]           NPC [DKN+ AAAI-11, BNW IJCAI-11]
  Maximin                 P [BTT SCW-89b]           NPC [XZP+ IJCAI-09]
  Ranked pairs            NPC [XZP+ IJCAI-09]       NPC [XZP+ IJCAI-09]
  Bucklin                 P [XZP+ IJCAI-09]         P [XZP+ IJCAI-09]
  Nanson's rule           NPC [NWX AAAI-11]         NPC [NWX AAAI-11]
  Baldwin's rule          NPC [NWX AAAI-11]         NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule (including many common voting rules), the manipulators' power exhibits a phase transition at Θ(√n) manipulators: with asymptotically fewer than √n manipulators the coalition has no power (the probability that it can change the outcome goes to 0), and with asymptotically more it is all-powerful (the probability goes to 1)
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
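A code sketch of the greedy idea behind that result for Borda: add manipulator votes one at a time, each ranking c first and the currently strongest rivals in the lowest positions. This illustrates the approach rather than reproducing the paper's exact pseudocode; the example profile is made up.

def borda_scores(profile, alts):
    m = len(alts)
    s = {a: 0 for a in alts}
    for v in profile:
        for pos, a in enumerate(v):
            s[a] += m - 1 - pos
    return s

def greedy_uco_borda(profile_nm, c):
    # add manipulator votes until c wins; each vote puts c first and the
    # currently highest-scoring rivals in the lowest (fewest-point) positions
    alts = list(profile_nm[0])
    votes, k = list(profile_nm), 0
    while True:
        s = borda_scores(votes, alts)
        if all(s[c] >= s[a] for a in alts if a != c):      # c wins (ties broken in c's favor)
            return k
        rivals = sorted((a for a in alts if a != c), key=lambda a: s[a])  # weakest first
        votes.append(tuple([c] + rivals))                  # strongest rival ends up last
        k += 1

profile_nm = [('a','b','c','d')]*3 + [('b','c','d','a')]*2
print(greedy_uco_borda(profile_nm, 'd'))   # 4 manipulators (within additive error 1 of optimal, as stated above)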

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
  – For c to win, the gap pi - p of every other alternative ci must be absorbed by the manipulators' votes; ci plays the role of job Ji with workload (roughly) pi - p
  – Every manipulator ranks c first, so c gains s1 per manipulator vote; placing ci in position j of that vote (j ≥ 2) closes s1 - sj of ci's gap
  – Example (slide figure omitted): with PNM ∪ {V1 = [c > c1 > c2 > c3]}, the gap of c1 shrinks by s1 - s2, that of c2 by s1 - s3, and that of c3 by s1 - s4; the positions of a manipulator vote thus act like machines with speeds s1 - s2, …, s1 - sm
55

The approximation algorithm
• Original UCO instance → scheduling problem → optimal preemptive schedule [Gonzalez&Sahni JACM 78] → rounding → solution to the UCO
• The rounded solution uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators' votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
  – two alternatives {a, b}
  – a probability p with 0.5 < p < 1
• Suppose
  – each agent's preference is generated i.i.d., such that
  – w/p p, it agrees with the ground truth
  – w/p 1-p, it differs from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
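A quick Monte Carlo illustration of the theorem (p = 0.6 is an arbitrary choice; odd n avoids ties):

import random

def majority_correct_rate(n, p, trials=20000):
    # fraction of trials in which a majority of n i.i.d. voters matches the ground truth
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        hits += correct_votes * 2 > n
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, round(majority_correct_rate(n, 0.6), 3))
# the probability of a correct majority increases toward 1 as n grows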

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given "ground truth" opinions W and p<1, generate opinions V such that each opinion is i.i.d.:
  – if c ≻ d in W, then c ≻ d in V w/p p, and d ≻ c in V w/p 1-p
77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
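A brute-force code sketch of the model for a small number of alternatives (enumerating all rankings is only feasible for small m; phi = 0.5 is an arbitrary choice):

from itertools import permutations, combinations

def kendall(v, w):
    return sum((v.index(a) < v.index(b)) != (w.index(a) < w.index(b))
               for a, b in combinations(v, 2))

def mallows(ground_truth, phi):
    # Pr(V | W) proportional to phi ** kendall(V, W), normalized over all rankings
    rankings = list(permutations(ground_truth))
    weights = [phi ** kendall(v, ground_truth) for v in rankings]
    z = sum(weights)
    return {v: w / z for v, w in zip(rankings, weights)}

dist = mallows(('a', 'b', 'c'), phi=0.5)
print(max(dist, key=dist.get))   # ('a', 'b', 'c'): the ground truth is the mode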

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
• Given a model Mr with ground truth Θ and data D = (P1, …, Pn):
  – Step 1 (statistical inference): from D, infer information about the ground truth
  – Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): find the most probable ranking from data D = (P1, …, Pn)
• Step 2: output its top-1 alternative
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
  – there is an unknown but fixed ground truth
  – p = 10/14 = 0.714
  – Pr(2 heads | p = 0.714) = 0.714^2 = 0.51 > 0.5
  – Yes!
• Bayesian
  – the ground truth is captured by a belief distribution
  – compute Pr(p | Data) assuming a uniform prior
  – Pr(2 heads | Data) = 0.485 < 0.5
  – No!
83
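Both numbers can be reproduced directly; a minimal check (the Bayesian computation assumes the uniform prior stated above, so the posterior over p is Beta(11, 5)):

from fractions import Fraction

# Likelihood (plug-in) reasoning: p-hat = 10/14
p_hat = Fraction(10, 14)
print(float(p_hat ** 2))              # 0.510..., > 0.5, so "yes"

# Bayesian reasoning with a uniform prior: posterior over p is Beta(11, 5),
# and Pr(two heads | data) = E[p^2] = (11*12) / (16*17)
pred = Fraction(11 * 12, 16 * 17)
print(float(pred))                    # 0.485..., < 0.5, so "no"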

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1 (MLE): find the most probable ranking from data D = (P1, …, Pn)
• Step 2: output its top-1 alternative
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet's model
• Step 1: compute the likelihood of all parameters (opinions) given data D = (P1, …, Pn)
• Step 2: choose the top alternative of the most probable ranking
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1: Bayesian update, giving a posterior over rankings from data D = (P1, …, Pn)
• Step 2: output the most likely top-1 alternative
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                         Anonymity, neutrality,   Consistency   Condorcet   Easy to compute
                         monotonicity
  Likelihood (Mallows)          Y                      N             Y              N
  Bayesian (Condorcet)          Y                      Y             N              N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2) ≠ ∅,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
• Model selection: how can we evaluate fitness?
• Likelihood or Bayesian? (here: focus on MLE)
• Computation: how can we compute the MLE efficiently?
• (framework figure omitted: Data D → inference → information about the ground truth → decision making → decision)
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2 ≻ c1 ≻ c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
(figure omitted: the three utility distributions parameterized by θ1, θ2, θ3 and one draw U1, U2, U3)

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
• Parameters (θ1, θ2, θ3) generate each agent's vote independently, e.g. Agent 1: P1 = c2 ≻ c1 ≻ c3, …, Agent n: Pn = c1 ≻ c2 ≻ c3 (figure omitted)
96
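A code sketch of the generative process, assuming unit-variance normal utility distributions with made-up means (a Thurstone-style RUM, for illustration only):

import random

def sample_ranking(theta, sigma=1.0):
    # draw a perceived utility for each alternative and sort by it
    utilities = {a: random.gauss(mu, sigma) for a, mu in theta.items()}
    return tuple(sorted(theta, key=utilities.get, reverse=True))

theta = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}   # means of the utility distributions
profile = [sample_ranking(theta) for _ in range(5)]
print(profile)   # e.g. [('c1','c2','c3'), ('c2','c1','c3'), ...]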

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
  Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
  – the i-th factor is the probability that ci is the top choice among {ci, …, cm}; the last factor is the probability that cm-1 is preferred to cm

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
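A minimal code sketch of the product formula above (the λ values are arbitrary):

def plackett_luce_prob(ranking, lam):
    # Pr(ranking | lambda): product over positions of lambda(top) / sum of remaining lambdas
    prob, remaining = 1.0, list(ranking)
    while len(remaining) > 1:
        top = remaining.pop(0)
        prob *= lam[top] / (lam[top] + sum(lam[a] for a in remaining))
    return prob

lam = {'c1': 3.0, 'c2': 2.0, 'c3': 1.0}
print(plackett_luce_prob(('c1', 'c2', 'c3'), lam))   # 3/6 * 2/3 = 1/3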

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
  Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) … ∫_{U2}^{∞} μ1(U1) dU1 … dUm-1 dUm
  (Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞)

98
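Even without an analytical solution, such a probability can be estimated by straightforward Monte Carlo sampling; a sketch assuming unit-variance normal utilities with made-up means:

import random

def mc_ranking_prob(theta, ranking, sigma=1.0, trials=100_000):
    # Monte Carlo estimate of Pr(ranking | theta) for a RUM with normal utilities
    hits = 0
    for _ in range(trials):
        u = {a: random.gauss(theta[a], sigma) for a in theta}
        if tuple(sorted(theta, key=u.get, reverse=True)) == ranking:
            hits += 1
    return hits / trials

theta = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}
print(mc_ranking_prob(theta, ('c1', 'c2', 'c3')))   # estimated Pr(c1 ≻ c2 ≻ c3), up to sampling noise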

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
                              LL            Pred. LL       AIC            BIC
  Value(Normal) - Value(PL)   44.8 (15.8)   87.4 (30.5)    -79.6 (31.6)   -50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 80

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: n·m·log m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (Yes/No)
– 2^11 = 2048 combinations in total
– 5 of the 11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
• (ties are broken according to a fixed priority order over the alternatives)
• Alice, Bob, and Carol each submit a ranking over three alternatives
• By misreporting her ranking, one of the voters can change the plurality winner
to an alternative she prefers over the truthful outcome

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submits 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Republic of Venice
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes
• How often? Seems not very often
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the situation in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
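
For intuition about the problem (not about its complexity), here is a brute-force sketch that decides UCM for tiny instances by enumerating all joint manipulator votes; the plurality rule and its tie-breaking below are illustrative assumptions, and `rule` can be any function mapping a profile to a winner.

```python
# Brute-force UCM sketch for tiny instances: try all joint manipulator votes
# and check whether c wins.
from itertools import permutations, product
from collections import Counter

def plurality_winner(profile):
    tally = Counter(v[0] for v in profile)
    # break score ties by alternative name (an illustrative assumption)
    return min(tally, key=lambda a: (-tally[a], a))

def ucm(rule, profile_nm, n_manipulators, c):
    alts = profile_nm[0]
    for votes in product(permutations(alts), repeat=n_manipulators):
        if rule(profile_nm + [list(v) for v in votes]) == c:
            return list(votes)       # a successful manipulation
    return None                      # no manipulation exists

profile_nm = [["a", "b", "c"], ["b", "c", "a"], ["c", "b", "a"]]
print(ucm(plurality_winner, profile_nm, 1, "c"))  # e.g. [('c', 'a', 'b')]
```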

The stunningly big table for UCM
                        One manipulator         At least two manipulators
Copeland                P [BTT SCW-89b]         NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]         NPC [BO SCW-91]
Veto                    P [ZPR AIJ-09]          P [ZPR AIJ-09]
Plurality with runoff   P [ZPR AIJ-09]          P [ZPR AIJ-09]
Cup                     P [CSL JACM-07]         P [CSL JACM-07]
Borda                   P [BTT SCW-89b]         NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                 P [BTT SCW-89b]         NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]     NPC [XZP+ IJCAI-09]
Bucklin                 P [XZP+ IJCAI-09]       P [XZP+ IJCAI-09]
Nanson's rule           NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
Baldwin's rule          NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– with o(√n) manipulators: w.h.p. they have no power over the outcome
– with ω(√n) manipulators: w.h.p. they are all-powerful
– the phase transition is at Θ(√n) manipulators
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
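
As a rough illustration of the greedy idea for Borda (a sketch of the approach, not the paper's exact algorithm): each added manipulator ranks c first and the remaining alternatives from lowest to highest current Borda score, so the strongest competitors receive the fewest extra points.

```python
# Greedy UCO sketch for Borda: add manipulators one at a time, each ranking c
# first and the other alternatives in increasing order of current Borda score,
# until c is the unique winner. (Tie handling differs from the published algorithm.)
def borda_scores(profile, alts):
    m = len(alts)
    score = {a: 0 for a in alts}
    for v in profile:
        for pos, a in enumerate(v):
            score[a] += m - 1 - pos
    return score

def greedy_uco_borda(profile_nm, c):
    alts = list(profile_nm[0])
    profile = [list(v) for v in profile_nm]
    k = 0
    while True:
        score = borda_scores(profile, alts)
        if all(score[c] > score[a] for a in alts if a != c):
            return k, profile[len(profile_nm):]   # manipulator count and their votes
        others = sorted((a for a in alts if a != c), key=lambda a: score[a])
        profile.append([c] + others)
        k += 1

profile_nm = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["b", "a", "d", "c"]]
print(greedy_uco_borda(profile_nm, "d"))   # 3 manipulators suffice here
```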

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
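
For Q|pmtn|Cmax the optimal makespan has a well-known closed form (achieved, for example, by the level algorithm): with jobs and speeds sorted in decreasing order, it is the maximum of (total work)/(total speed) and, over k, (sum of the k largest jobs)/(sum of the k fastest speeds). A small sketch with illustrative numbers:

```python
# Closed-form optimal makespan for Q|pmtn|Cmax.
def opt_makespan(jobs, speeds):
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    bounds = [sum(jobs) / sum(speeds)]                      # total work / total speed
    for k in range(1, min(len(jobs), len(speeds) - 1) + 1):
        bounds.append(sum(jobs[:k]) / sum(speeds[:k]))      # k largest jobs / k fastest machines
    return max(bounds)

print(opt_makespan(jobs=[9.0, 5.0, 3.0], speeds=[3.0, 2.0, 1.0]))  # -> 3.0
```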

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1
obtain in the non-manipulators' profile PNM
• Every manipulator ranks c first, so each manipulator vote gives c exactly s1 points
• Alternative ci becomes a job Ji whose amount of work is ci's score deficit against c
• Positions 2, …, m of a manipulator vote become machines with speeds
s1 - s2, s1 - s3, …, s1 - sm: putting ci in position j of one vote reduces ci's deficit by s1 - sj
• Example: after adding one manipulator vote V1 = [c > c1 > c2 > c3],
– c1's remaining deficit is p1 - p - (s1 - s2), reduced at speed s1 - s2
– c2's remaining deficit is p2 - p - (s1 - s3), reduced at speed s1 - s3
– c3's remaining deficit is p3 - p - (s1 - s4), reduced at speed s1 - s4
55

The approximation algorithm
• Original UCO instance → scheduling problem (via the reduction above)
• Solve the scheduling problem optimally [Gonzalez&Sahni JACM-78]
• Round the (preemptive) schedule back to a solution of the UCO instance
• The result uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[figure: three pictures A, B, C]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
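
A quick simulation with illustrative parameters shows the convergence: with p = 0.6, the probability that the majority matches the ground truth approaches 1 as n grows.

```python
# Simulation of the jury theorem for p = 0.6 and growing electorate sizes.
import random

def majority_correct_rate(n, p, trials=20000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        correct_votes = sum(rng.random() < p for _ in range(n))
        hits += correct_votes > n / 2
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, round(majority_correct_rate(n, p=0.6), 3))
# roughly 0.6, 0.75, 0.98, and then essentially 1.0
```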

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem
can be extended to m > 2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
– if c ≻ d in W, then c ≻ d in V w/p p, and d ≻ c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
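
A small sketch of why the MLE is Kemeny: since Pr(V|W) ∝ ϕ^Kendall(V,W) with 0 < ϕ < 1, maximizing the likelihood over W is the same as minimizing the total Kendall tau distance. The brute-force check below (feasible only for small m; the profile and ϕ are illustrative) makes this concrete.

```python
# Mallows MLE by brute force: maximizing phi**(total Kendall distance) over W
# is equivalent to minimizing the total Kendall distance, i.e. the Kemeny rule.
from itertools import combinations, permutations

def kendall_tau(v, w):
    pv = {x: i for i, x in enumerate(v)}
    pw = {x: i for i, x in enumerate(w)}
    return sum((pv[x] < pv[y]) != (pw[x] < pw[y]) for x, y in combinations(v, 2))

def mallows_mle(profile, phi=0.5):
    # likelihood(W) is proportional to phi ** total_distance(W)
    return max(permutations(profile[0]),
               key=lambda w: phi ** sum(kendall_tau(p, list(w)) for p in profile))

profile = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(mallows_mle(profile))   # ('a', 'b', 'c'), which is also the Kemeny ranking
```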

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
[diagram] Given a model Mr: a ground truth Θ generates the data D = (P1, P2, …, Pn)
• Step 1 (statistical inference): from D, extract information about the ground truth
• Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): from the data D = (P1, P2, …, Pn), compute the most probable ranking
• Step 2 (top-1 alternative): output the top-ranked alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p | Data), assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
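
The two numbers on the slide can be reproduced directly: the likelihood answer plugs in the MLE p = 10/14, while the Bayesian answer uses the Beta(11, 5) posterior obtained from a uniform prior after 10 heads and 4 tails.

```python
# Likelihood: plug in the MLE. Bayesian: uniform prior -> Beta(11, 5) posterior,
# and Pr(next two heads | Data) is the posterior mean of p^2.
p_mle = 10 / 14
print(p_mle, p_mle ** 2)                              # 0.714..., 0.510...

a, b = 1 + 10, 1 + 4                                  # Beta(11, 5) posterior
two_heads = (a / (a + b)) * ((a + 1) / (a + b + 1))   # E[p^2] for a Beta(a, b)
print(two_heads)                                      # 0.485...
```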

Kemeny = Likelihood approach
Mr = Mallows model
• Step 1 (MLE): from the data D = (P1, P2, …, Pn), compute the most probable ranking
• Step 2 (top-1 alternative): output its top-ranked alternative as the winner
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
• Step 1: compute the likelihood of every parameter (every combination of opinions)
• Step 2: choose the top-1 alternative of the most probable ranking
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet's model
• Step 1 (Bayesian update): from the data D = (P1, P2, …, Pn), compute the posterior over rankings
• Step 2: output the most likely top-1 alternative as the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                       Anonymity, neutrality,
                       monotonicity             Consistency   Condorcet   Easy to compute
Likelihood (Mallows)   Y                        N             Y           N
Bayesian (Condorcet)   Y                        Y             N           N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[diagram: Data D → inference → information about the ground truth → decision making → Decision]
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2 ≻ c1 ≻ c3 | θ1, θ2, θ3) = Pr(U2 > U1 > U3), where each Ui is drawn from μi
[figure: the three utility distributions μ1, μ2, μ3 and one sampled utility from each]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters (θ1, θ2, θ3)
Agent 1: P1 = c2 ≻ c1 ≻ c3
…
Agent n: Pn = c1 ≻ c2 ≻ c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) =
λ1/(λ1 + λ2 + … + λm) × λ2/(λ2 + … + λm) × … × λm-1/(λm-1 + λm)
– the i-th factor is the probability that ci is the top choice in {ci, …, cm};
the last factor is the probability that cm-1 is preferred to cm

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
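
A direct sketch of the formula above (the weights λ are illustrative): the probability of a ranking is a product of "pick the top among the remaining" terms, and the probabilities of all m! rankings sum to 1.

```python
# Plackett-Luce probability of a ranking.
from itertools import permutations

def plackett_luce_prob(ranking, lam):
    """ranking: list of alternatives, best first; lam: dict of positive weights."""
    prob = 1.0
    remaining = list(ranking)
    for c in ranking[:-1]:
        prob *= lam[c] / sum(lam[x] for x in remaining)
        remaining.remove(c)
    return prob

lam = {"a": 3.0, "b": 2.0, "c": 1.0}
print(plackett_luce_prob(["a", "b", "c"], lam))   # 3/6 * 2/3 = 1/3
print(sum(plackett_luce_prob(list(r), lam) for r in permutations(lam)))  # 1.0
```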

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) … ∫_{U2}^{∞} μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞

98
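
Although no closed form is known, the probability is easy to estimate by Monte Carlo: sample the utilities from the normal distributions and count how often the sampled order matches the ranking. The means and variance below are illustrative.

```python
# Monte Carlo estimate of Pr(c1 > c2 > ... > cm | Theta) for a normal RUM.
import random

def mc_prob_ranking(means, sd=1.0, samples=200000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = [rng.gauss(mu, sd) for mu in means]
        hits += all(u[i] > u[i + 1] for i in range(len(u) - 1))
    return hits / samples

print(mc_prob_ranking([1.0, 0.5, 0.0]))   # roughly 0.34 for these illustrative means
```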

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):
  LL: 44.8 (15.8)    Pred. LL: 87.4 (30.5)    AIC: -79.6 (31.6)    BIC: -50.5 (31.6)

Red: statistically significant with 95% confidence
102
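
For reference, the two criteria in the table in their standard forms (k = number of free parameters, n = number of data points, ll = maximized log-likelihood; lower is better under this sign convention). The numbers below are illustrative, not those from the table.

```python
# Standard AIC and BIC definitions.
import math

def aic(ll, k):
    return 2 * k - 2 * ll

def bic(ll, k, n):
    return k * math.log(n) - 2 * ll

# A model with higher log-likelihood can still lose on AIC/BIC if it uses many
# more parameters (illustrative numbers, n = 50 data points).
print(aic(ll=-120.0, k=40), bic(ll=-120.0, k=40, n=50))
print(aic(ll=-130.0, k=10), bic(ll=-130.0, k=10, n=50))
```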

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 81

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 82

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives = {main courses} × {wines}
30

Multiple referenda
• In California, voters voted on 11 binary issues (yes/no)
– 2^11 = 2048 combinations in total
– 5/11 are about budget and taxes
• Prop. 30: increase sales and some income tax for education
• Prop. 38: increase income tax on almost everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
           Action P   Action Q   Liable? (P∧Q)
Judge 1    Y          Y          Y
Judge 2    Y          N          N
Judge 3    N          Y          N
Majority   Y          Y          N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of a fixed priority order over the alternatives)
– Alice, Bob, and Carol each submit a ranking over three alternatives
– Plurality rule: under truthful voting, the winner (after tie-breaking) is an alternative that Alice ranks low; by reporting a ranking different from her true preferences, Alice changes the plurality winner to an alternative she prefers

Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: a voter submits 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is so complicated that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Republic of Venice
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? It may lead to very undesirable outcomes
• How often does this happen? Seems not very often
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
• Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to cryptography

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
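
Conceptually, UCM can always be decided by brute force over the manipulators' votes; the hardness results below say that for some rules nothing fundamentally better is possible (unless P = NP). A minimal Python sketch, assuming a winner(profile) function implementing the rule r (plurality with lexicographic tie-breaking is used here purely as an illustration):

    from itertools import combinations_with_replacement, permutations

    def ucm(winner, P_NM, n_manip, c, alternatives):
        # try every multiset of manipulator votes (their order does not matter)
        for votes in combinations_with_replacement(permutations(alternatives), n_manip):
            if winner(P_NM + [list(v) for v in votes]) == c:
                return True
        return False

    def plurality_winner(profile):
        tally = {}
        for v in profile:
            tally[v[0]] = tally.get(v[0], 0) + 1
        return max(sorted(tally), key=lambda a: tally[a])  # ties broken lexicographically

    P_NM = [['a', 'b', 'c'], ['b', 'a', 'c'], ['b', 'a', 'c']]
    print(ucm(plurality_winner, P_NM, 2, 'a', ['a', 'b', 'c']))  # True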

The stunningly big table for UCM
                        One manipulator          At least two manipulators
Copeland                P [BTT SCW-89b]          NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]          NPC [BO SCW-91]
Veto                    P [ZPR AIJ-09]           P [ZPR AIJ-09]
Plurality with runoff   P [ZPR AIJ-09]           P [ZPR AIJ-09]
Cup                     P [CSL JACM-07]          P [CSL JACM-07]
Borda                   P [BTT SCW-89b]          NPC [DKN+ AAAI-11] [BNW IJCAI-11]
Maximin                 P [BTT SCW-89b]          NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]      NPC [XZP+ IJCAI-09]
Bucklin                 P [XZP+ IJCAI-09]        P [XZP+ IJCAI-09]
Nanson's rule           NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
Baldwin's rule          NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules
– there is a phase transition at Θ(√n) manipulators: with asymptotically fewer than √n manipulators the coalition has no power (with probability going to 1), and with asymptotically more it is all-powerful (with probability going to 1)
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
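
A minimal Python sketch of a natural greedy of this kind (each manipulator puts c first and then gives the largest remaining Borda score to the alternative whose current total is lowest, adding manipulators until c ties or beats everyone); this is a sketch of the greedy idea, not the paper's exact pseudocode:

    def borda_scores(profile, alternatives):
        m = len(alternatives)
        scores = {a: 0 for a in alternatives}
        for v in profile:
            for pos, a in enumerate(v):
                scores[a] += m - 1 - pos
        return scores

    def greedy_borda_uco(P_NM, c, alternatives, max_manip=1000):
        profile = [list(v) for v in P_NM]
        for k in range(max_manip + 1):
            s = borda_scores(profile, alternatives)
            if all(s[c] >= s[a] for a in alternatives if a != c):   # ties assumed to favor c
                return k
            # c first; the currently weakest alternative gets the next-highest score
            others = sorted((a for a in alternatives if a != c), key=lambda a: s[a])
            profile.append([c] + others)
        return None

    P_NM = [['a', 'b', 'c', 'd'], ['a', 'c', 'b', 'd'], ['b', 'a', 'd', 'c']]
    print(greedy_borda_uco(P_NM, 'd', ['a', 'b', 'c', 'd']))  # 3 manipulators suffice here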

An approximation algorithm for
positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problem Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resumed later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
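
For uniform machines with preemption the optimal makespan has a simple closed form (a classical result for Q|pmtn|Cmax): with jobs and speeds sorted in decreasing order, it is the maximum of the total work over the total speed and, for each k, the k largest jobs over the k fastest machines. A minimal Python sketch, assuming at least as many jobs as machines:

    def opt_makespan(jobs, speeds):
        jobs = sorted(jobs, reverse=True)       # processing requirements, largest first
        speeds = sorted(speeds, reverse=True)   # machine speeds, fastest first
        bounds = [sum(jobs) / sum(speeds)]      # all the work spread over all the machines
        for k in range(1, len(speeds)):
            bounds.append(sum(jobs[:k]) / sum(speeds[:k]))  # k largest jobs on k fastest machines
        return max(bounds)

    print(opt_makespan(jobs=[8, 6, 5], speeds=[3, 2, 1]))  # 19/6 ≈ 3.17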

Thinking about UCO for positional scoring rules
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the non-manipulators' profile PNM
• W.l.o.g. every manipulator ranks c first, so each manipulator vote adds s1 points to c
• Example: after adding the manipulator vote V1 = [c > c1 > c2 > c3] to PNM, the deficit of c1 relative to c shrinks from p1 - p to p1 - p - (s1 - s2), that of c2 to p2 - p - (s1 - s3), and that of c3 to p3 - p - (s1 - s4)
• In scheduling terms, each alternative ci is a job Ji whose remaining work is its deficit, and the score differences s1 - s2, s1 - s3, …, s1 - sm act as the machine speeds
55

The approximation algorithm
• Step 1: convert the original UCO instance into an instance of the scheduling problem Q|pmtn|Cmax
• Step 2: solve the scheduling problem optimally [Gonzalez&Sahni JACM-78]
• Step 3: round the preemptive schedule back into manipulator votes for the original UCO instance
• The rounded solution uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators' votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W such that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[pictures of three alternatives A, B, C]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents' preferences
converges in probability to the ground truth
75
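
The convergence is easy to see numerically: the probability that the majority of n i.i.d. agents agrees with the ground truth is a binomial tail. A minimal Python sketch (odd n, so there are no ties):

    from math import comb

    def pr_majority_correct(n, p):
        # probability that more than half of n i.i.d. agents agree with the ground truth
        return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n // 2 + 1, n + 1))

    for n in [1, 11, 101, 1001]:
        print(n, pr_majority_correct(n, p=0.6))
    # the printed probabilities increase toward 1 as n grows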

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
– if c≻d in W, then w/p p the opinion in V is c≻d, and w/p 1-p it is d≻c

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
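
The normalization constant of the Mallows model also has a closed form, Z(ϕ) = ∏_{i=1}^{m} (1 + ϕ + … + ϕ^(i-1)), so Pr(V|W) = ϕ^Kendall(V,W) / Z(ϕ) can be evaluated directly. A minimal Python sketch that checks the probabilities sum to 1 over all rankings:

    from itertools import combinations, permutations

    def kendall(v, w):
        return sum(1 for a, b in combinations(v, 2)
                   if (v.index(a) < v.index(b)) != (w.index(a) < w.index(b)))

    def mallows_prob(V, W, phi):
        Z = 1.0
        for i in range(1, len(W) + 1):              # Z(phi) = prod_i (1 + phi + ... + phi^(i-1))
            Z *= sum(phi**j for j in range(i))
        return phi**kendall(V, W) / Z

    W = ('a', 'b', 'c')
    print(sum(mallows_prob(V, W, phi=0.5) for V in permutations(W)))  # 1.0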

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a parametric model Mr with ground truth Θ:
• Step 1 (statistical inference): from the data D = (P1, …, Pn), extract information about the ground truth
• Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model
• Decision space: a unique winner
• Step 1 (MLE): compute the most probable ranking from the data D = (P1, …, Pn)
• Step 2: output the top-1 alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 ≈ 0.714
– Pr(2 heads | p=0.714) = 0.714^2 ≈ 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– Pr(2 heads | Data) ≈ 0.485 < 0.5
– No!

83
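
Both numbers can be reproduced in a couple of lines: the likelihood answer plugs in the MLE p = 10/14, while the Bayesian answer with a uniform prior is E[p²] under the Beta(11, 5) posterior. A minimal Python sketch:

    # likelihood reasoning: plug in the MLE of p
    p_mle = 10 / 14
    print(p_mle ** 2)                      # about 0.51 > 0.5, so "yes"

    # Bayesian reasoning: uniform prior + 10 heads, 4 tails  =>  posterior is Beta(11, 5)
    alpha, beta = 11, 5
    e_p_squared = alpha * (alpha + 1) / ((alpha + beta) * (alpha + beta + 1))
    print(e_p_squared)                     # about 0.485 < 0.5, so "no"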

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1 (MLE): compute the most probable ranking from the data D = (P1, …, Pn)
• Step 2: output its top-1 alternative
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood of every parameter (every combination of pairwise opinions)
• Step 2: choose the top alternative of the most probable ranking
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1 (Bayesian update): compute the posterior over rankings
• Step 2: output the most likely top-1 alternative
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                        Anonymity, neutrality,   Consistency   Condorcet   Easy to
                        monotonicity                                       compute
Likelihood (Mallows)    Y                        N             Y           N
Bayesian (Condorcet)    Y                        Y             N           N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(Pipeline: Data D → inference → information about the ground truth → decision making → Decision)

• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3 (one utility distribution per alternative)
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1, …, λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = [λ1/(λ1 + … + λm)] × [λ2/(λ2 + … + λm)] × … × [λm-1/(λm-1 + λm)]
– the first factor is the probability that c1 is the top choice in {c1, …, cm}, the second that c2 is the top choice in {c2, …, cm}, …, and the last that cm-1 is preferred to cm
• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
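
The product formula translates directly into code. A minimal Python sketch that evaluates the Plackett-Luce probability of a ranking (with hypothetical parameters λ) and checks that the probabilities of all rankings sum to 1:

    from itertools import permutations

    def pl_prob(ranking, lam):
        # lam maps each alternative to its positive parameter λ
        prob, remaining = 1.0, list(ranking)
        for c in ranking[:-1]:
            prob *= lam[c] / sum(lam[a] for a in remaining)
            remaining.remove(c)
        return prob

    lam = {'a': 3.0, 'b': 2.0, 'c': 1.0}
    print(pl_prob(('a', 'b', 'c'), lam))                    # 3/6 * 2/3 ≈ 0.333
    print(sum(pl_prob(r, lam) for r in permutations(lam)))  # 1.0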

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, and U1 from U2 to ∞

98
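
Even without an analytical form, Pr(c1 ≻ … ≻ cm | Θ) can be estimated by sampling the perceived utilities. A minimal Monte Carlo Python sketch for unit-variance normal utility distributions with hypothetical means:

    import random

    def mc_ranking_prob(means, ranking, samples=100000):
        # fraction of utility draws whose sorted order is exactly the given ranking
        hits = 0
        for _ in range(samples):
            u = {c: random.gauss(means[c], 1.0) for c in means}
            if sorted(means, key=lambda c: -u[c]) == list(ranking):
                hits += 1
        return hits / samples

    means = {'a': 1.0, 'b': 0.5, 'c': 0.0}
    print(mc_ranking_prob(means, ('a', 'b', 'c')))  # roughly 0.4-0.5 for these means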

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters

                            LL            Pred. LL      AIC            BIC
Value(Normal) - Value(PL)   44.8 (15.8)   87.4 (30.5)   -79.6 (31.6)   -50.5 (31.6)

Red: statistically significant with 95% confidence
102
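
As a reminder of what these criteria measure: AIC = 2k - 2·LL and BIC = k·ln(n) - 2·LL, where LL is the maximized log-likelihood, k the number of free parameters, and n the number of observations; lower AIC/BIC is better, which is why their differences have the opposite sign from the log-likelihood differences. A minimal Python sketch with made-up numbers (the parameter counts here are illustrative, not the ones from the experiment above):

    from math import log

    def aic(log_lik, k):
        return 2 * k - 2 * log_lik

    def bic(log_lik, k, n):
        return k * log(n) - 2 * log_lik

    # hypothetical fits on n = 50 votes: a richer model with k = 18 parameters
    # vs a Plackett-Luce-style model with k = 9
    print(aic(-620.0, 18), bic(-620.0, 18, 50))
    print(aic(-650.0, 9), bic(-650.0, 9, 50))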

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS + Social Choice: computational thinking + optimization algorithms, strategic thinking + methods/principles of aggregation

Thank you!


Slide 83

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 84

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important is strategy-proofness as a
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
• (Worked example with three alternatives and voters Alice, Bob, and Carol;
ties are broken in favor of a fixed alternative. Under the plurality rule, one
voter is better off ranking a less-preferred alternative first; a sketch follows.)
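A hypothetical version of this example in Python (the alternative names and the tie-breaking order are my own choices, since the original slide used pictures):

    TIE_BREAK = ['a', 'b', 'c']   # in a tie, the alternative listed earlier wins

    def plurality_winner(profile):
        # profile: list of rankings (lists of alternatives, most preferred first)
        scores = {alt: 0 for alt in TIE_BREAK}
        for vote in profile:
            scores[vote[0]] += 1
        best = max(scores.values())
        return next(alt for alt in TIE_BREAK if scores[alt] == best)

    truthful = [['a', 'b', 'c'],    # Alice
                ['b', 'a', 'c'],    # Bob
                ['c', 'b', 'a']]    # Carol
    print(plurality_winner(truthful))       # 'a', Carol's least preferred alternative

    # Carol misreports b as her favorite and gets an outcome she prefers to a
    manipulated = [['a', 'b', 'c'], ['b', 'a', 'c'], ['b', 'c', 'a']]
    print(plurality_winner(manipulated))    # 'b'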

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences
– Approval voting: a voter submits 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is so complicated that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
• Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
– Why prevent manipulation? It may lead to very
undesirable outcomes
• Can we use computational complexity as a barrier? Yes
• Is it a strong barrier? No
– How often is manipulation hard? Seems not very often
• Other barriers?
– Limited information
– Limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– similar to the reasoning behind cryptography

For which common voting rules is
manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, & Lang JACM-07]
– Unweighted manipulation: easy for most common rules
– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
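For tiny instances the UCM question can be answered by sheer enumeration over all possible manipulator votes, which makes the definition concrete. This is a sketch only (exponential in n' and m); the plurality rule with lexicographic tie-breaking used here is an assumption for illustration.

    from itertools import permutations, product

    def plurality(profile):
        # plurality with lexicographic tie-breaking
        alts = sorted(profile[0])
        scores = {a: sum(1 for v in profile if v[0] == a) for a in alts}
        return max(alts, key=lambda a: scores[a])

    def ucm(rule, profile_nm, n_manipulators, c):
        # is there a manipulator profile PM such that c wins under `rule` on PNM + PM?
        rankings = list(permutations(sorted(profile_nm[0])))
        return any(rule(profile_nm + list(pm)) == c
                   for pm in product(rankings, repeat=n_manipulators))

    profile_nm = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c')]
    print(ucm(plurality, profile_nm, 2, 'b'))   # True: two manipulators can make b win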

The stunningly big table for UCM
#manipulators            One manipulator         At least two
Copeland                 P [BTT SCW-89b]         NPC [FHS AAMAS-08,10]
STV                      NPC [BO SCW-91]         NPC [BO SCW-91]
Veto                     P [ZPR AIJ-09]          P [ZPR AIJ-09]
Plurality with runoff    P [ZPR AIJ-09]          P [ZPR AIJ-09]
Cup                      P [CSL JACM-07]         P [CSL JACM-07]
Borda                    P [BTT SCW-89b]         NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                  P [BTT SCW-89b]         NPC [XZP+ IJCAI-09]
Ranked pairs             NPC [XZP+ IJCAI-09]     NPC [XZP+ IJCAI-09]
Bucklin                  P [XZP+ IJCAI-09]       P [XZP+ IJCAI-09]
Nanson's rule            NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
Baldwin's rule           NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules
– with o(√n) manipulators, the probability that they can change the
outcome goes to 0 (no power)
– with ω(√n) manipulators, the probability that they can make any given
alternative win goes to 1 (all-powerful)
– the phase transition happens at Θ(√n) manipulators
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
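The flavor of the greedy approach is easy to convey in code. The sketch below is my paraphrase of the greedy idea for Borda (rank c first, then give the lowest remaining positions to the currently strongest opponents), not the exact pseudocode of [Zuckerman, Procaccia, & Rosenschein AIJ-09]; ties are assumed to be broken in favor of c.

    def borda_scores(profile, alternatives):
        m = len(alternatives)
        scores = {a: 0 for a in alternatives}
        for vote in profile:
            for pos, a in enumerate(vote):
                scores[a] += m - 1 - pos
        return scores

    def greedy_borda_uco(profile_nm, c, alternatives, max_manipulators=50):
        profile = list(profile_nm)
        for k in range(max_manipulators + 1):
            scores = borda_scores(profile, alternatives)
            if all(scores[c] >= scores[a] for a in alternatives if a != c):
                return k    # c wins with k manipulators (ties broken in favor of c)
            # each new manipulator ranks c first and the others by increasing current score,
            # so the currently strongest opponent ends up in the last (0-point) position
            others = sorted((a for a in alternatives if a != c), key=lambda a: scores[a])
            profile.append([c] + others)
        return None

    profile_nm = [['a', 'b', 'c'], ['a', 'c', 'b'], ['b', 'a', 'c']]
    print(greedy_borda_uco(profile_nm, 'c', ['a', 'b', 'c']))   # 2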

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
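For reference, the minimum makespan of Q|pmtn|Cmax has a well-known closed form: the larger of "total work over total speed" and, for each k, "the k longest jobs over the k fastest machines"; preemptive algorithms such as the one in [Gonzalez&Sahni JACM 78] attain it. A minimal sketch (plain lists of positive numbers assumed):

    def min_makespan(job_lengths, machine_speeds):
        p = sorted(job_lengths, reverse=True)       # longest jobs first
        s = sorted(machine_speeds, reverse=True)    # fastest machines first
        bounds = [sum(p) / sum(s)]                  # all the work spread over all the speed
        for k in range(1, min(len(p), len(s))):
            bounds.append(sum(p[:k]) / sum(s[:k]))  # k longest jobs on the k fastest machines
        return max(bounds)

    print(min_makespan([7, 5, 3], [3, 2, 1]))   # 2.5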

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1
obtain in the non-manipulators' profile
• Each manipulator ranks c at the top (w.l.o.g.), so alternative ci behaves
like a job Ji of length pi - p, and position j+1 of a manipulator's vote
behaves like a machine of speed s1 - sj+1 (each vote lets every machine
work on a distinct job for one unit of time)
• Example with m = 4: after adding V1 = [c > c1 > c2 > c3] to PNM
– J1 was processed on the machine of speed s1 - s2, leaving length p1 - p - (s1 - s2)
– J2 was processed on the machine of speed s1 - s3, leaving length p2 - p - (s1 - s3)
– J3 was processed on the machine of speed s1 - s4, leaving length p3 - p - (s1 - s4)
• c wins once every job's remaining length is at most 0
55

The approximation algorithm
• Original UCO instance → corresponding Q|pmtn|Cmax scheduling problem
• Solve the scheduling problem [Gonzalez&Sahni JACM 78]
• Round the scheduling solution back into manipulators' votes
• The resulting solution to the UCO uses no more than OPT + m - 2
manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators' votes
– The manipulators can fully coordinate their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}, one of which is the ground truth
– a probability p > 0.5
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth
• Then, as n→∞, the majority of agents' preferences
converges in probability to the ground truth
75
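A tiny simulation makes the convergence visible (p and the values of n below are arbitrary choices for illustration):

    import random

    def majority_correct(n, p, trials=2000):
        # fraction of trials in which the majority of n i.i.d. agents matches the ground truth
        wins = 0
        for _ in range(trials):
            correct_votes = sum(1 for _ in range(n) if random.random() < p)
            if correct_votes > n / 2:
                wins += 1
        return wins / trials

    for n in (1, 11, 101, 1001):
        print(n, majority_correct(n, p=0.55))   # the fractions increase toward 1 as n grows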

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
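A small sketch of the Mallows model and its MLE (brute force over rankings; the normalization constant used below is the standard one, and the profile is an illustration):

    from itertools import permutations

    def kendall(v, w):
        pos = {a: i for i, a in enumerate(w)}
        return sum(1 for i in range(len(v)) for j in range(i + 1, len(v))
                   if pos[v[i]] > pos[v[j]])

    def mallows_prob(v, w, phi):
        # Pr(V | W) = phi^Kendall(V, W) / Z, with Z = prod_i (1 + phi + ... + phi^(i-1))
        z = 1.0
        for i in range(1, len(w) + 1):
            z *= sum(phi ** k for k in range(i))
        return phi ** kendall(v, w) / z

    def mle_ranking(profile):
        # argmax_W prod_V Pr(V | W) = argmin_W sum_V Kendall(V, W): the Kemeny ranking
        return min(permutations(profile[0]),
                   key=lambda w: sum(kendall(v, w) for v in profile))

    profile = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c')]
    print(mle_ranking(profile))                                          # ('a', 'b', 'c')
    print(round(mallows_prob(('a', 'b', 'c'), ('a', 'b', 'c'), 0.5), 3))  # 0.381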

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
• Given a parametric model Mr: a ground truth Θ generates the data
D = (P1, P2, …, Pn)
• Step 1: statistical inference, extracting information about the ground
truth from the data D
• Step 2: decision making, mapping that information to a decision
(winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model
• Step 1: MLE, giving the most probable ranking
• Step 2: the top-1 alternative of that ranking
• Decision space: a unique winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p | Data) assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
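Both numbers can be checked in a few lines: with a uniform prior on p, Pr(2 heads | 10H, 4T) = B(13,5)/B(11,5) = 132/272 ≈ 0.485 (a verification sketch added here, not part of the slide):

    from fractions import Fraction
    from math import factorial

    def beta(a, b):
        # Beta function for positive integers: B(a, b) = (a-1)!(b-1)!/(a+b-1)!
        return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))

    p_mle = Fraction(10, 14)
    print(float(p_mle ** 2))                 # 0.5102... (likelihood reasoning: > 0.5)

    # Bayesian with a uniform prior: Pr(2 heads | data) = B(13, 5) / B(11, 5)
    print(float(beta(13, 5) / beta(11, 5)))  # 0.4852... (< 0.5)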

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1: MLE, giving the most probable ranking
• Step 2: the top-1 alternative of that ranking
• This is the Kemeny rule
(for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood for all parameters
(opinions)
• Step 2: choose the top-1 alternative of the most
probable ranking
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1: Bayesian update, giving a posterior over rankings
• Step 2: the most likely top-1 alternative
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(the same two-step framework: Data D → inference → information about the
ground truth → decision making → Decision)

• Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
(figure: the three utility distributions μ1, μ2, μ3 and one draw with U2 > U1 > U3)

95
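A sketch of the generative process for a normal RUM (the means and the common variance below are arbitrary illustration values):

    import random

    def sample_ranking(means, sigma=1.0):
        # draw a perceived utility for each alternative, then rank by decreasing utility
        utilities = {alt: random.gauss(mu, sigma) for alt, mu in means.items()}
        return tuple(sorted(utilities, key=utilities.get, reverse=True))

    means = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}
    print([sample_ranking(means) for _ in range(5)])
    # e.g. [('c1', 'c2', 'c3'), ('c2', 'c1', 'c3'), ('c1', 'c3', 'c2'), ...]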

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
(figure: each agent i independently draws Pi from the model with parameters
θ1, θ2, θ3; e.g. Agent 1: P1 = c2≻c1≻c3, …, Agent n: Pn = c1≻c2≻c3)
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm)
= λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the i-th factor is the probability that ci is the top choice in {ci,…,cm}
(the last factor: cm-1 is preferred to cm)

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
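The product formula translates directly into code; a minimal sketch (the λ values are illustration choices):

    def pl_prob(ranking, lambdas):
        # probability of the ranking under Plackett-Luce: pick the top, remove it, repeat
        prob = 1.0
        remaining = list(ranking)
        while len(remaining) > 1:
            prob *= lambdas[remaining[0]] / sum(lambdas[a] for a in remaining)
            remaining.pop(0)
        return prob

    lambdas = {'c1': 3.0, 'c2': 2.0, 'c3': 1.0}
    print(pl_prob(('c1', 'c2', 'c3'), lambdas))   # 3/6 * 2/3 = 0.333...

    from itertools import permutations
    print(round(sum(pl_prob(r, lambdas) for r in permutations(lambdas)), 6))   # 1.0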

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫ μm(Um) ∫ μm-1(Um-1) … ∫ μ1(U1) dU1 … dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
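Even without an analytical solution, the probability above can be estimated by simple Monte Carlo, which is enough for intuition (the means and variance below are illustration values):

    import random

    def mc_rank_prob(means, sigma=1.0, samples=100000):
        # estimate Pr(U1 > U2 > ... > Um) when Ui ~ Normal(means[i], sigma^2)
        hits = 0
        for _ in range(samples):
            u = [random.gauss(mu, sigma) for mu in means]
            if all(u[i] > u[i + 1] for i in range(len(u) - 1)):
                hits += 1
        return hits / samples

    print(mc_rank_prob([1.0, 0.5, 0.0]))   # roughly 0.4: an estimate of Pr(c1 ≻ c2 ≻ c3)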

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log-likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
(the expectation is approximately computed by Gibbs sampling)
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters

                            LL            Pred. LL      AIC            BIC
Value(Normal) - Value(PL)   44.8 (15.8)   87.4 (30.5)   -79.6 (31.6)   -50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

3. Statistical approaches
• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms
CS
Social
Choice
Strategic thinking + methods/principles of aggregation

Thank you!


Slide 85

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 86

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences
– Approval voting: a voter submits 0 or 1 for each alternative

41

Computational thinking
• Use a voting rule that is so complicated that nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
[Flowchart]
• Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
– Why prevent manipulation? It may lead to very undesirable outcomes
  • How often? Seems not very often
– Can we use computational complexity as a barrier? Yes
  • Is it a strong barrier? No
  • Other barriers? Limited information, limited communication
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the situation in cryptography

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
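A minimal sketch of the UCM decision problem (the rule here is plurality with a fixed tie-breaking order, and the instance is hypothetical): it simply enumerates all multisets of manipulator votes, which is only feasible for tiny instances.

```python
from itertools import combinations_with_replacement, permutations

def plurality_winner(profile, tie_break):
    """Plurality winner; ties broken in favor of alternatives earlier in tie_break."""
    scores = {c: 0 for c in tie_break}
    for vote in profile:
        scores[vote[0]] += 1
    return max(tie_break, key=lambda c: (scores[c], -tie_break.index(c)))

def ucm(rule, p_nm, n_manipulators, c, alternatives):
    """Brute-force UCM: does some manipulator profile P_M make c win under rule?
    For an anonymous rule only the multiset of manipulator votes matters;
    the search is exponential, so this is purely illustrative."""
    rankings = list(permutations(alternatives))
    return any(rule(p_nm + list(p_m), alternatives) == c
               for p_m in combinations_with_replacement(rankings, n_manipulators))

alternatives = ('a', 'b', 'c')
p_nm = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'c', 'a')]
print(ucm(plurality_winner, p_nm, 2, 'b', alternatives))  # True: two manipulators can elect b
print(ucm(plurality_winner, p_nm, 0, 'b', alternatives))  # False: a wins the truthful profile
```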

The stunningly big table for UCM
Rule                    One manipulator          At least two manipulators
Copeland                P   [BTT SCW-89b]        NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]          NPC [BO SCW-91]
Veto                    P   [ZPR AIJ-09]         P   [ZPR AIJ-09]
Plurality with runoff   P   [ZPR AIJ-09]         P   [ZPR AIJ-09]
Cup                     P   [CSL JACM-07]        P   [CSL JACM-07]
Borda                   P   [BTT SCW-89b]        NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                 P   [BTT SCW-89b]        NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]      NPC [XZP+ IJCAI-09]
Bucklin                 P   [XZP+ IJCAI-09]      P   [XZP+ IJCAI-09]
Nanson’s rule           NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
Baldwin’s rule          NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules
– [Phase-transition figure: as the number of manipulators grows, the manipulators go from having no power (o(√n) manipulators) to being all-powerful (ω(√n) manipulators); Θ(√n) is the threshold]
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted (and may resume later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
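For intuition, the optimal makespan of Q|pmtn|Cmax has a classical closed form, attained by the level algorithm of Gonzalez & Sahni; a sketch assuming at least as many jobs as machines:

```python
def optimal_makespan(jobs, speeds):
    """Optimal makespan for Q|pmtn|Cmax (assumes at least as many jobs as machines).
    Classical bound: with jobs and speeds sorted in decreasing order,
    Cmax* = max( max_{k < #machines} sum(jobs[:k]) / sum(speeds[:k]),
                 sum(jobs) / sum(speeds) )."""
    jobs = sorted(jobs, reverse=True)
    speeds = sorted(speeds, reverse=True)
    bounds = [sum(jobs) / sum(speeds)]
    for k in range(1, len(speeds)):
        bounds.append(sum(jobs[:k]) / sum(speeds[:k]))
    return max(bounds)

print(optimal_makespan([8, 5, 3], [4, 2, 1]))  # 16/7 ≈ 2.286
```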

Thinking about UCO for positional scoring rules (UCOpos)
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the non-manipulators’ profile
• [Figure: a manipulator vote V1 = [c > c1 > c2 > c3] is added to PNM; each remaining alternative ci then corresponds to a job Ji whose remaining work is roughly its score deficit pi − p, and the score gaps s1 − s2, s1 − s3, s1 − s4 play the role of machine speeds in the scheduling instance]
55

The approximation algorithm
[Diagram: Original UCO → scheduling problem → optimal schedule [Gonzalez&Sahni JACM-78] → rounding → solution to the UCO, using no more than OPT + m − 2 manipulators]
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information about the non-manipulators’ votes
– The manipulators can perfectly coordinate their strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote W such that
• no matter how many followers there are, the leader and potential followers are not worse off

• sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92; Tideman SCW-07; Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[Figure: three pictures A, B, C; Turker 1 reports A > B > C, Turker 2 reports B > A, …, Turker n reports B > C]
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p with 0.5 < p < 1
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1−p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
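A quick simulation of the theorem (illustrative only, with an arbitrary competence p = 0.6):

```python
import random

def majority_correct(n, p, trials=2000):
    """Fraction of trials in which the majority of n i.i.d. agents,
    each agreeing with the ground truth w/p p, recovers the ground truth."""
    wins = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        wins += correct_votes > n / 2
    return wins / trials

for n in (1, 11, 101, 1001):
    print(n, round(majority_correct(n, p=0.6), 3))
# the fraction tends to 1 as n grows, as the theorem states
```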

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t. each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1−p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1, generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
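A small sketch (with an illustrative profile) of why the MLE under the Mallows model is the Kemeny ranking: maximizing ∏_V ϕ^Kendall(V,W) with ϕ < 1 is the same as minimizing the total Kendall tau distance.

```python
from itertools import combinations, permutations
from math import prod

def kendall(v, w):
    """Number of pairwise disagreements between rankings v and w."""
    pos = {c: i for i, c in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2) if pos[a] > pos[b])

def mallows_mle_ranking(profile, phi=0.5):
    """Argmax over W of the Mallows likelihood prod_V phi**Kendall(V, W);
    since phi < 1, this minimizes total Kendall tau distance, i.e. Kemeny."""
    alternatives = profile[0]
    return max(permutations(alternatives),
               key=lambda w: prod(phi ** kendall(v, w) for v in profile))

profile = [('a', 'b', 'c'), ('b', 'c', 'a'), ('a', 'c', 'b')]
print(mallows_mle_ranking(profile))  # ('a', 'b', 'c'), the Kemeny ranking of this profile
```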

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
• Given a parametric model Mr with ground truth Θ; the data D = (P1, …, Pn) are drawn from Mr
• Step 1: statistical inference — extract information about the ground truth from the data D
• Step 2: decision making — map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1: MLE — compute the most probable ranking from the data D = (P1, …, Pn)
• Step 2: output the top-1 alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = 0.714^2 = 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
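Both numbers on this slide can be reproduced directly: with a uniform prior the posterior is Beta(heads+1, tails+1), so the Bayesian answer is E[p²] = a(a+1)/((a+b)(a+b+1)).

```python
def pr_two_more_heads_bayesian(heads, tails):
    """Pr(next two tosses are heads | data) under a uniform prior on p:
    posterior is Beta(heads+1, tails+1), and E[p^2] = a(a+1) / ((a+b)(a+b+1))."""
    a, b = heads + 1, tails + 1
    return a * (a + 1) / ((a + b) * (a + b + 1))

p_mle = 10 / 14
print(round(p_mle ** 2, 2))                          # 0.51  (likelihood reasoning: yes)
print(round(pr_two_more_heads_bayesian(10, 4), 3))   # 0.485 (Bayesian: no)
```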

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1: MLE — the most probable ranking given the data D = (P1, …, Pn)
• Step 2: top-1 alternative of that ranking → the winner
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood of all parameters (opinions) given the data D = (P1, …, Pn)
• Step 2: choose the top alternative of the most probable ranking → the winner
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet’s model
• Step 1: Bayesian update — compute the posterior over rankings given the data D = (P1, …, Pn)
• Step 2: choose the most likely top-1 alternative → the winner
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
[Table comparing the two approaches on: anonymity, neutrality, monotonicity | consistency | Condorcet consistency | easy to compute]
Rows: Likelihood (Mallows) and Bayesian (Condorcet); the Y/N cell entries did not survive extraction (the extracted values, in unknown order, were Y, Y, N, N, Y, N)
Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
[Diagram: Data D → inference → information about the ground truth → decision making → decision]
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
[Figure: utility distributions with parameters θ1, θ2, θ3 and sampled utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96
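A generative sketch of this process for a RUM with normal utility distributions (the parameters below are hypothetical):

```python
import random

def sample_profile(thetas, n, sd=1.0):
    """Draw n rankings from a RUM with normal utility distributions:
    each agent's perceived utility for alternative a is Normal(thetas[a], sd),
    and the agent ranks alternatives by decreasing perceived utility."""
    profile = []
    for _ in range(n):
        utilities = {a: random.gauss(mean, sd) for a, mean in thetas.items()}
        profile.append(tuple(sorted(thetas, key=utilities.get, reverse=True)))
    return profile

thetas = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}
print(sample_profile(thetas, 3))
```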

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = [λ1/(λ1+λ2+…+λm)] × [λ2/(λ2+…+λm)] × … × [λm-1/(λm-1+λm)]
– the first factor is the probability that c1 is the top choice in {c1,…,cm}, and the last factor is the probability that cm-1 is preferred to cm
• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
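The analytical form above translates directly into code; a small sketch (with hypothetical weights λ) of the P-L probability of a ranking and the log-likelihood of a profile:

```python
from math import log

def pl_ranking_prob(ranking, lam):
    """Probability of a full ranking (best first) under Plackett-Luce with weights lam."""
    prob, remaining = 1.0, list(ranking)
    while len(remaining) > 1:
        total = sum(lam[a] for a in remaining)
        prob *= lam[remaining.pop(0)] / total
    return prob

def pl_log_likelihood(profile, lam):
    """Log-likelihood of a profile of full rankings under Plackett-Luce."""
    return sum(log(pl_ranking_prob(r, lam)) for r in profile)

lam = {'c1': 3.0, 'c2': 2.0, 'c3': 1.0}
print(pl_ranking_prob(('c1', 'c2', 'c3'), lam))  # (3/6) * (2/3) = 1/3
```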

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{−∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from −∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.
• In each iteration t
– E-step: for any set of parameters Θ, compute the expected log likelihood (ELL)
  ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))   (approximately computed by Gibbs sampling)
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt) − Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) − Value(PL):
    LL            Pred. LL      AIC            BIC
    44.8 (15.8)   87.4 (30.5)   −79.6 (31.6)   −50.5 (31.6)
Red: statistically significant with 95% confidence
102
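For reference, the criteria in the table are the standard ones: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, with k the number of free parameters and n the number of data points (lower is better). The numbers below are hypothetical and only show how a higher log-likelihood can still lose on AIC/BIC once the extra parameters are penalized.

```python
from math import log

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * log(n) - 2 * log_likelihood

# hypothetical models: B fits slightly better but uses twice as many parameters
print(aic(-120.0, k=9),  bic(-120.0, k=9,  n=50))   # model A
print(aic(-115.0, k=18), bic(-115.0, k=18, n=50))   # model B: worse AIC and BIC
```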

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 87

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 88

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar to the use of hardness in cryptography
NPHard

For which common voting rules is
manipulation computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
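
A brute-force sketch of the UCM decision problem, instantiated for the plurality rule with lexicographic tie-breaking (both the rule and the tie-breaking are choices made here for illustration); it enumerates all joint manipulator votes, so it is only feasible for tiny instances:

from itertools import permutations, product

def plurality_winner(profile, alternatives):
    # Plurality score = number of first positions; ties broken lexicographically.
    scores = {a: 0 for a in alternatives}
    for ranking in profile:
        scores[ranking[0]] += 1
    best = max(scores.values())
    return min(a for a in alternatives if scores[a] == best)

def ucm_plurality(non_manipulators, n_manipulators, c, alternatives):
    # Try every combination of manipulator votes (exponential search).
    all_rankings = list(permutations(alternatives))
    for manip_votes in product(all_rankings, repeat=n_manipulators):
        profile = list(non_manipulators) + list(manip_votes)
        if plurality_winner(profile, alternatives) == c:
            return True
    return False

# Toy instance: two sincere voters and one manipulator who prefers "c".
P_NM = [("a", "b", "c"), ("b", "a", "c")]
print(ucm_plurality(P_NM, 1, "c", ["a", "b", "c"]))  # False: one manipulator cannot make c win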

The stunningly big table for UCM
#manipulators          One manipulator         At least two
Copeland               P [BTT SCW-89b]         NPC [FHS AAMAS-08,10]
STV                    NPC [BO SCW-91]         NPC [BO SCW-91]
Veto                   P [ZPR AIJ-09]          P [ZPR AIJ-09]
Plurality with runoff  P [ZPR AIJ-09]          P [ZPR AIJ-09]
Cup                    P [CSL JACM-07]         P [CSL JACM-07]
Borda                  P [BTT SCW-89b]         NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                P [BTT SCW-89b]         NPC [XZP+ IJCAI-09]
Ranked pairs           NPC [XZP+ IJCAI-09]     NPC [XZP+ IJCAI-09]
Bucklin                P [XZP+ IJCAI-09]       P [XZP+ IJCAI-09]
Nanson's rule          NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
Baldwin's rule         NPC [NWX AAAI-11]       NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules
– if the number of manipulators is o(√n), then with high probability they have no power
– if the number of manipulators is ω(√n), then with high probability they are all-powerful
– Θ(√n) manipulators is the transition point
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
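
To convey the flavor of the greedy idea (a simplified rendering of it, not necessarily the exact algorithm of [ZPR AIJ-09]): each manipulator ranks c first and gives the currently strongest rival the fewest remaining Borda points.

def borda_greedy_manipulators(scores_nm, c):
    """scores_nm: Borda scores from the non-manipulators (dict alternative -> score).
    Adds greedy manipulator votes until c weakly beats every other alternative,
    and returns the number of votes added."""
    m = len(scores_nm)
    scores = dict(scores_nm)
    votes = 0
    while any(scores[a] > scores[c] for a in scores if a != c):
        scores[c] += m - 1  # c is ranked first and receives m-1 points
        # Rivals sorted from strongest to weakest; the strongest receives
        # the fewest remaining points (0, 1, 2, ...).
        rivals = sorted((a for a in scores if a != c),
                        key=lambda a: scores[a], reverse=True)
        for points, a in enumerate(rivals):
            scores[a] += points
        votes += 1
    return votes

print(borda_greedy_manipulators({"c": 0, "a": 4, "b": 3}, "c"))  # 3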

An approximation algorithm for
positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs may be interrupted
(and resume later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
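
A sketch of the classical closed-form characterization of the optimal makespan for Q|pmtn|Cmax (the maximum of total work over total speed and, for each k, the k largest jobs over the k fastest machines), assuming that characterization:

def preemptive_makespan(job_lengths, machine_speeds):
    """Minimum makespan for Q|pmtn|Cmax (uniform machines, preemption allowed)."""
    jobs = sorted(job_lengths, reverse=True)       # largest jobs first
    speeds = sorted(machine_speeds, reverse=True)  # fastest machines first
    m, n = len(speeds), len(jobs)
    bounds = [sum(jobs[:k]) / sum(speeds[:k]) for k in range(1, min(m, n) + 1)]
    if n > m:
        bounds.append(sum(jobs) / sum(speeds))     # all the work on all the machines
    return max(bounds)

print(preemptive_makespan([8, 5, 3], [3, 2, 1]))   # 8/3: the largest job on the fastest machine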

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
• Each manipulator ranks c in the top position; placing ci in position j of a manipulator's vote reduces ci's advantage over c by s1 - sj
• So ci corresponds to a job Ji of length pi - p, and positions 2,…,m correspond to machines of speeds s1 - s2, …, s1 - sm
• Example: after adding V1 = [c > c1 > c2 > c3] to PNM,
– J1 (for c1) has remaining length p1 - p - (s1 - s2), having run for one step on the machine of speed s1 - s2
– J2 (for c2) has remaining length p2 - p - (s1 - s3)
– J3 (for c3) has remaining length p3 - p - (s1 - s4)
55

The approximation algorithm
• Reduce the original UCO instance to a Q|pmtn|Cmax scheduling problem
• Solve the scheduling problem optimally [Gonzalez & Sahni JACM-78]
• Round the scheduling solution back to a solution of the UCO
• The rounded solution uses no more than OPT + m - 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p with 0.5 < p < 1
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
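
A quick Monte Carlo sketch of the jury theorem (the values of p and n below are arbitrary): the fraction of trials in which the majority matches the ground truth approaches 1 as n grows.

import random

def majority_correct_rate(n_agents, p, trials=2000, seed=0):
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        # Each agent independently agrees with the ground truth w.p. p.
        agree = sum(rng.random() < p for _ in range(n_agents))
        if agree > n_agents / 2:
            correct += 1
    return correct / trials

for n in (1, 11, 101, 1001):
    print(n, majority_correct_rate(n, p=0.6))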

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, the opinions in V are generated i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1-p

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
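
A brute-force sketch (feasible only for small m) that makes the equivalence explicit: maximizing the Mallows likelihood over rankings is the same as minimizing the total Kendall tau distance, i.e., computing a Kemeny ranking.

from itertools import combinations, permutations

def kendall(v, w):
    # Number of pairs of alternatives ordered differently in the two rankings.
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(1 for a, b in combinations(v, 2)
               if (pos_v[a] - pos_v[b]) * (pos_w[a] - pos_w[b]) < 0)

def mallows_mle(profile, alternatives):
    # Pr(V|W) ∝ phi ** kendall(V, W) with 0 < phi < 1, so maximizing the
    # likelihood of W = minimizing Σ kendall(V, W); the argmax does not depend on phi.
    def total_distance(w):
        return sum(kendall(v, w) for v in profile)
    return min(permutations(alternatives), key=total_distance)

profile = [("a", "b", "c"), ("a", "b", "c"), ("b", "c", "a")]
print(mallows_mle(profile, ["a", "b", "c"]))  # ('a', 'b', 'c')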

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given Mr (with ground truth Θ):
• Step 1: statistical inference — from the data D = (P1, P2, …, Pn), infer information about the ground truth
• Step 2: decision making — from that information, output a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model
• Decision space: a unique winner
• Step 1: MLE → the most probable ranking
• Step 2: top-1 alternative → winner
Data D = (P1, P2, …, Pn)
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: heads w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p | Data) assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
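
The two calculations on this slide, reproduced as a short sketch (the Bayesian side assumes the uniform Beta(1,1) prior, as on the slide):

heads, tails = 10, 4

# Likelihood (plug-in) reasoning: use the MLE of p.
p_mle = heads / (heads + tails)
print(round(p_mle, 3), round(p_mle ** 2, 3))        # 0.714 0.51

# Bayesian reasoning: with a uniform Beta(1,1) prior the posterior is
# Beta(heads+1, tails+1), and Pr(two more heads | Data) = E[p^2]
# = (a / (a+b)) * ((a+1) / (a+b+1)) with a = heads+1, b = tails+1.
a, b = heads + 1, tails + 1
print(round((a / (a + b)) * ((a + 1) / (a + b + 1)), 3))   # 0.485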

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1: MLE → the most probable ranking
• Step 2: top-1 alternative → winner
• This is the Kemeny rule (for single winner)!
Data D = (P1, P2, …, Pn)
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood for all parameters (opinions) → the most probable ranking
• Step 2: choose the top alternative of the most probable ranking → winner
Data D = (P1, P2, …, Pn)
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet's model
• Step 1: Bayesian update → posterior over rankings
• Step 2: most likely top-1 → winner
Data D = (P1, P2, …, Pn)
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                       Anonymity, neutrality,   Consistency   Condorcet   Easy to
                       monotonicity                                       compute
Likelihood (Mallows)   Y                        N             Y           N
Bayesian (Condorcet)   Y                        N             N           Y

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
[Framework recap: Data D → inference → information about the ground truth → decision making → decision]
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui ∼ μi}(U2 > U1 > U3)
[Figure: utility distributions μ1, μ2, μ3, parameterized by θ1, θ2, θ3, with sampled utilities U1, U2, U3]
95
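
A minimal sketch of this generative process, using normal utility distributions with unit variance (an arbitrary modeling choice made here for illustration):

import random

def sample_ranking(theta, sigma=1.0, rng=random):
    """theta: dict alternative -> mean utility. Returns one sampled ranking."""
    utilities = {a: rng.gauss(mu, sigma) for a, mu in theta.items()}
    # Alternatives are ranked by their perceived utilities, highest first.
    return tuple(sorted(theta, key=lambda a: utilities[a], reverse=True))

random.seed(0)
theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}
profile = [sample_ranking(theta) for _ in range(5)]
print(profile)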

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm) = [λ1/(λ1+…+λm)] × [λ2/(λ2+…+λm)] × … × [λm-1/(λm-1+λm)]
– the first factor: c1 is the top choice in {c1,…,cm}
– the last factor: cm-1 is preferred to cm in {cm-1, cm}

• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
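
A direct transcription of the product formula above into a short sketch (the λ values below are arbitrary):

def plackett_luce_prob(ranking, lam):
    """lam: dict alternative -> positive weight λ. Probability of the ranking."""
    prob = 1.0
    remaining = list(ranking)
    # At each step the next-ranked alternative is chosen from the remaining ones
    # with probability proportional to its weight.
    while len(remaining) > 1:
        prob *= lam[remaining[0]] / sum(lam[a] for a in remaining)
        remaining.pop(0)
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}
print(plackett_luce_prob(("c1", "c2", "c3"), lam))  # (3/6) * (2/3) = 1/3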

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞; Um-1 from Um to ∞; …; U1 from U2 to ∞

98
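
Since no analytical solution is known, a crude Monte Carlo sketch can estimate Pr(c1 ≻ … ≻ cm | Θ) by sampling utilities and counting how often they induce the target order (the parameters below are arbitrary):

import random

def mc_ranking_prob(ranking, theta, sigma=1.0, samples=100_000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = {a: rng.gauss(theta[a], sigma) for a in theta}
        # Count the draw if the utilities are strictly decreasing along the ranking.
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

theta = {"c1": 1.0, "c2": 0.5, "c3": 0.0}
print(mc_ranking_prob(("c1", "c2", "c3"), theta))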

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi's belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.

• In each iteration t:
– E-step: for any set of parameters Θ, compute the expected log-likelihood (ELL)
  ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
  (g(Data, Θt) is approximately computed by Gibbs sampling)
– M-step: choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) - Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) − Value(PL):

LL: 44.8 (15.8)    Pred. LL: 87.4 (30.5)    AIC: -79.6 (31.6)    BIC: -50.5 (31.6)

Red: statistically significant with 95% confidence
102
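
For reference, the criteria in the table are computed from a fitted model's maximized log-likelihood via the standard definitions (k = number of free parameters, n = number of data points); the numbers below are hypothetical:

from math import log

def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    return k * log(n) - 2 * log_likelihood

# Lower AIC/BIC indicates a better fit-complexity trade-off.
print(aic(-500.0, k=9), bic(-500.0, k=9, n=50))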

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS ↔ Social Choice:
Computational thinking + optimization algorithms
Strategic thinking + methods/principles of aggregation

Thank you!


Slide 89

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 90

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46
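As a warm-up for the table on the next slide, here is a small Python sketch (mine) of why UCM is easy for plurality: the manipulators can do no better than all ranking c first, so the whole check is one score comparison. Tie-breaking in favor of c is an assumption of this sketch.

from collections import Counter

def plurality_ucm(non_manipulator_votes, n_manipulators, c):
    """non_manipulator_votes: list of rankings (best first).
    Returns True iff n_manipulators extra voters can make c the plurality winner
    (assuming ties are broken in favor of c)."""
    scores = Counter(v[0] for v in non_manipulator_votes)  # only top choices matter
    scores[c] += n_manipulators            # manipulators all rank c first
    return all(scores[c] >= s for a, s in scores.items() if a != c)

# Example: 3 honest voters, 1 manipulator who wants "c" to win
print(plurality_ucm([("a", "b", "c"), ("a", "c", "b"), ("c", "a", "b")], 1, "c"))  # True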

The stunningly big table for UCM

                        One manipulator          At least two manipulators
Copeland                P   [BTT SCW-89b]        NPC [FHS AAMAS-08,10]
STV                     NPC [BO SCW-91]          NPC [BO SCW-91]
Veto                    P   [ZPR AIJ-09]         P   [ZPR AIJ-09]
Plurality with runoff   P   [ZPR AIJ-09]         P   [ZPR AIJ-09]
Cup                     P   [CSL JACM-07]        P   [CSL JACM-07]
Borda                   P   [BTT SCW-89b]        NPC [DKN+ AAAI-11, BNW IJCAI-11]
Maximin                 P   [BTT SCW-89b]        NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]      NPC [XZP+ IJCAI-09]
Bucklin                 P   [XZP+ IJCAI-09]      P   [XZP+ IJCAI-09]
Nanson’s rule           NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
Baldwin’s rule          NPC [NWX AAAI-11]        NPC [NWX AAAI-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

– with o(√n) manipulators they are (with high probability) powerless, and with ω(√n)
manipulators they are (with high probability) all-powerful; Θ(√n) is the phase transition
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09] (a sketch of the greedy idea is given below)

52
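A minimal Python sketch (mine) of the greedy idea as I understand it from [ZPR AIJ-09]: each added manipulator ranks c first and then gives the largest remaining Borda scores to the currently weakest rivals, until c wins; tie-breaking in favor of c is assumed.

def borda_scores(profile, alternatives):
    m = len(alternatives)
    scores = {a: 0 for a in alternatives}
    for ranking in profile:
        for pos, a in enumerate(ranking):
            scores[a] += m - 1 - pos
    return scores

def greedy_borda_manipulators(non_manip_profile, alternatives, c, max_voters=1000):
    """Greedily add manipulator votes until c wins Borda (ties assumed broken for c)."""
    profile = list(non_manip_profile)
    manip_votes = []
    for _ in range(max_voters):
        scores = borda_scores(profile, alternatives)
        if all(scores[c] >= s for a, s in scores.items() if a != c):
            return manip_votes
        # c first; remaining alternatives ordered from lowest to highest current
        # score, so the strongest rivals receive the fewest points in this vote
        others = sorted((a for a in alternatives if a != c), key=lambda a: scores[a])
        vote = [c] + others
        manip_votes.append(vote)
        profile.append(vote)
    return None

profile = [["a", "b", "c"], ["a", "c", "b"]]
print(greedy_borda_manipulators(profile, ["a", "b", "c"], "c"))  # two manipulator votes suffice here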

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resumed later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
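If I recall correctly, Q|pmtn|Cmax has a classical closed-form optimum (cf. the [Gonzalez&Sahni JACM-78] reference two slides below): the makespan is the maximum of the total-work bound and the bounds matching the k largest jobs against the k fastest machines. A small Python sketch:

def optimal_makespan(job_sizes, speeds):
    """Optimal makespan for Q|pmtn|Cmax: max of total work / total speed and
    (sum of k largest jobs) / (sum of k fastest speeds) over k."""
    jobs = sorted(job_sizes, reverse=True)
    spds = sorted(speeds, reverse=True)
    bounds = [sum(jobs) / sum(spds)]
    for k in range(1, min(len(jobs), len(spds)) + 1):
        bounds.append(sum(jobs[:k]) / sum(spds[:k]))
    return max(bounds)

# Example: 3 machines with speeds 3, 2, 1 and jobs of sizes 7, 5, 4, 2
print(optimal_makespan([7, 5, 4, 2], [3, 2, 1]))  # 3.0 here: total work 18 / total speed 6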

Thinking about UCO for positional scoring rules (UCOpos)
• Let p, p1,…,pm-1 be the total points that c, c1,…,cm-1
obtain in the non-manipulators’ profile PNM
• Every manipulator ranks c first, so each manipulator vote gives c exactly s1 points
• Alternative ci corresponds to a job Ji whose remaining length is ci’s score advantage over c;
placing ci in position j (j = 2,…,m) of one manipulator vote shrinks that advantage by s1 − sj,
i.e. runs job Ji for one unit of time on a machine of speed s1 − sj
– e.g. after adding V1 = [c > c1 > c2 > c3] to PNM, the remaining length of J1 drops from
p1 − p to p1 − p − (s1 − s2)
55

The approximation algorithm
• Convert the original UCO instance into an instance of the scheduling problem
• Solve the scheduling problem optimally [Gonzalez&Sahni JACM-78]
• Round the scheduling solution back to a solution of the UCO instance
• The resulting number of manipulators is no more than OPT + m − 2
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W such that
• no matter how many followers there are, the
leader/potential followers are not worse off

• sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[Figure: three pictures A, B, C to be ranked]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a, b}
– a probability p > 0.5
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it agrees with the ground truth
– w/p 1−p, it differs from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
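A quick simulation (mine, not from the slides) illustrating the theorem with p = 0.6: the estimated probability that the majority matches the ground truth climbs toward 1 as n grows.

import random

def majority_correct_prob(p, n, trials=20000):
    """Estimate Pr[majority of n i.i.d. agents matches the ground truth],
    where each agent is correct independently with probability p."""
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        if correct_votes > n / 2:
            hits += 1
    return hits / trials

random.seed(0)
for n in (1, 11, 101, 1001):
    print(n, round(majority_correct_prob(0.6, n), 3))
# the estimates increase toward 1 as n grows (roughly 0.6, 0.75, 0.98, ~1.0)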

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given the “ground truth” opinions W and p<1, the opinions in V are generated i.i.d.:
– if c≻d in W, then c≻d in V w/p p, and d≻c in V w/p 1−p
77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
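A brute-force Python sketch (mine) of the connection: the Kemeny ranking minimizes the total Kendall tau distance to the profile, which is exactly the MLE of W under Mallows because Pr(D|W) ∝ ϕ^(Σ_V Kendall(V,W)) with ϕ < 1. (Brute force is only feasible for small m.)

from itertools import combinations, permutations

def kendall_tau(v, w):
    """Number of pairwise comparisons on which rankings v and w disagree."""
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(
        (pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b])
        for a, b in combinations(v, 2)
    )

def kemeny(profile):
    """Brute-force Kemeny: the ranking minimizing total Kendall tau distance to the profile."""
    alternatives = profile[0]
    return min(
        permutations(alternatives),
        key=lambda w: sum(kendall_tau(v, w) for v in profile),
    )

profile = [("a", "b", "c"), ("a", "b", "c"), ("b", "c", "a")]
print(kemeny(profile))  # ('a', 'b', 'c')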

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a parametric model Mr with ground truth Θ and data D = (P1,…,Pn):
• Step 1 (statistical inference): from the data D, extract information about the ground truth
• Step 2 (decision making): map that information to a decision (winner, ranking, etc.)
81

Example: Kemeny
Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): compute the most probable ranking given the data D = (P1,…,Pn)
• Step 2: output the top-1 alternative of that ranking
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p = 0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p|Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!
83
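The two numbers above are easy to verify; a short Python check (mine): the likelihood answer plugs in p̂ = 10/14, while the Bayesian answer takes the posterior mean of p² under the Beta(11, 5) posterior induced by the uniform prior.

from fractions import Fraction

heads, tails = 10, 4

# Likelihood (plug-in) reasoning: p_hat = 10/14, predict two heads with p_hat^2
p_hat = Fraction(heads, heads + tails)
print(float(p_hat**2))                       # ~0.510 > 0.5  -> "Yes"

# Bayesian reasoning: uniform prior on p gives posterior Beta(heads+1, tails+1);
# Pr(next two tosses are heads | data) = E[p^2] under Beta(11, 5)
a, b = heads + 1, tails + 1
posterior_mean_p2 = Fraction(a * (a + 1), (a + b) * (a + b + 1))
print(float(posterior_mean_p2))              # ~0.485 < 0.5  -> "No"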

Kemeny = Likelihood approach
Mr = Mallows model
• Step 1 (MLE): compute the most probable ranking given the data D = (P1,…,Pn)
• Step 2: output its top-1 alternative
This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
Mr = Condorcet model
• Step 1: compute the likelihood of all parameters (opinions) given the data D = (P1,…,Pn)
• Step 2: choose the top alternative of the most probable ranking
85

Example: Bayesian [Young APSR-88]
Mr = Condorcet’s model
• Step 1 (Bayesian update): compute the posterior over rankings given the data D = (P1,…,Pn)
• Step 2: output the most likely top-1 alternative
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]

                          Anonymity, neutrality,   Consistency   Condorcet   Easy to
                          monotonicity                                       compute
Likelihood (Mallows)      Y                        N             Y           N
Bayesian (Condorcet)      Y                        Y             N           N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[The framework again: Data D → inference → information about the ground truth → decision making → Decision]
• Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

• Computation
– How can we compute the MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
[Figure: the utility distributions parameterized by θ1, θ2, θ3, with sampled utilities U1, U2, U3]
95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters: θ1, θ2, θ3
Agent 1: P1 = c2≻c1≻c3
…
Agent n: Pn = c1≻c2≻c3
96
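A tiny Python simulation (mine) of the generative process above, using normal utility distributions with illustrative means:

import random

def sample_ranking(means, sd=1.0):
    """Sample one agent's ranking: draw a utility per alternative and sort descending."""
    utils = {alt: random.gauss(mu, sd) for alt, mu in means.items()}
    return tuple(sorted(utils, key=utils.get, reverse=True))

random.seed(0)
means = {"c1": 1.0, "c2": 0.5, "c3": 0.0}     # illustrative ground-truth parameters
profile = [sample_ranking(means) for _ in range(5)]
print(profile)   # rankings tend to put c1 ahead of c2 ahead of c3, with noise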

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1,…,λm)
= λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– the first factor is the probability that c1 is the top choice in {c1,…,cm}, the next that c2 is
the top choice in {c2,…,cm}, …, the last that cm-1 is preferred to cm

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
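A compact Python sketch (mine) of the Plackett-Luce likelihood displayed above; the weights are illustrative:

def pl_probability(ranking, lam):
    """Probability of a ranking (best first) under Plackett-Luce with weights lam."""
    prob = 1.0
    remaining = list(ranking)
    while len(remaining) > 1:
        top = remaining.pop(0)
        prob *= lam[top] / sum(lam[a] for a in [top] + remaining)
    return prob

lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}          # illustrative weights
print(pl_probability(("c1", "c2", "c3"), lam))    # (3/6) * (2/3) = 1/3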

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{−∞}^{∞} μm(Um) ∫_{Um}^{∞} μm-1(Um-1) ⋯ ∫_{U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from −∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞
98
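Even without a closed form, the nested integral above is straightforward to estimate by simulation; a quick Monte Carlo sketch (mine), assuming unit-variance normals:

import random

def mc_ranking_prob(means, ranking, sd=1.0, trials=100000):
    """Monte Carlo estimate of Pr(ranking | Theta) for a normal RUM with unit variance."""
    hits = 0
    for _ in range(trials):
        utils = {alt: random.gauss(means[alt], sd) for alt in means}
        if all(utils[ranking[i]] > utils[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / trials

random.seed(0)
means = {"c1": 1.0, "c2": 0.5, "c3": 0.0}          # illustrative parameters
print(mc_ranking_prob(means, ("c1", "c2", "c3")))  # estimate of Pr(c1≻c2≻c3 | Θ); above 1/6, the equal-means value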

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential
family (EF)
– includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
– E-step, for any set of parameters Θ:
compute the expected log likelihood (ELL)
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt)),
where the expectation is approximately computed by Gibbs sampling
– M-step:
choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)

• Until |Pr(D|Θt) − Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with normal distributions and P-L on
– log-likelihood (LL)
– predictive log-likelihood (Pred. LL)
– Akaike information criterion (AIC)
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, 50 randomly chosen voters

Value(Normal) − Value(PL):   LL            Pred. LL      AIC            BIC
                             44.8 (15.8)   87.4 (30.5)   -79.6 (31.6)   -50.5 (31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

CS + Social Choice
Computational thinking + optimization algorithms
Strategic thinking + methods/principles of aggregation

Thank you!


Slide 91

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 92

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules
– with o(√n) manipulators, the manipulators asymptotically have no power; with ω(√n) manipulators, they are asymptotically all-powerful (under the conditions in the paper); Θ(√n) manipulators is the transition point
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
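For intuition, here is a sketch of one natural greedy in the spirit of the Borda algorithm analyzed in [Zuckerman, Procaccia, &Rosenschein AIJ-09]: each manipulator ranks c first and then the rivals from currently weakest to currently strongest, so the strongest rival gets the fewest points. The published algorithm and its additive-error guarantee are as stated in the paper; treat this as an illustration rather than a faithful reimplementation.

def greedy_borda_uco(profile_nm, alternatives, c):
    # Greedy upper bound on the number of manipulators needed to make c a
    # Borda co-winner (ties broken in favor of c): every manipulator ranks c
    # first and then the rivals from the currently weakest to the strongest,
    # so the strongest rival gets 0 points from that vote.
    m = len(alternatives)
    score = {a: 0 for a in alternatives}
    for ranking in profile_nm:
        for position, a in enumerate(ranking):
            score[a] += m - 1 - position
    manipulators = 0
    while any(score[a] > score[c] for a in alternatives if a != c):
        rivals = sorted((a for a in alternatives if a != c), key=lambda a: score[a])
        vote = [c] + rivals
        for position, a in enumerate(vote):
            score[a] += m - 1 - position
        manipulators += 1
    return manipulators

profile_nm = [("a", "b", "c", "d"), ("a", "b", "d", "c"), ("b", "a", "c", "d")]
print(greedy_borda_uco(profile_nm, ["a", "b", "c", "d"], c="d"))  # 3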

An approximation algorithm for
positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resumed later, possibly on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54
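For reference, a sketch of the standard closed-form value of the minimum makespan for Q|pmtn|Cmax (the quantity computed by level-type algorithms such as the one in [Gonzalez&Sahni JACM 78]); the formula is quoted from memory, so treat the exact attribution as an assumption.

def optimal_makespan(job_sizes, speeds):
    # Minimum makespan for Q|pmtn|Cmax: the maximum of total work over total
    # speed and, for each k, the k largest jobs over the k fastest machines.
    p = sorted(job_sizes, reverse=True)
    s = sorted(speeds, reverse=True)
    bounds = [sum(p) / sum(s)]
    for k in range(1, min(len(p), len(s))):
        bounds.append(sum(p[:k]) / sum(s[:k]))
    return max(bounds)

print(optimal_makespan(job_sizes=[8, 4], speeds=[2, 1, 1]))  # 4.0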

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
[Figure: starting from the non-manipulators' profile PNM plus one manipulator vote V1 = [c>c1>c2>c3], each rival ci is mapped to a job Ji whose size is determined by p, pi and the score vector, while differences of score-vector entries (s1-s2, s1-s3, s1-s4) play the role of machine speeds]
55

The approximation algorithm
Original UCO → scheduling problem → solved by [Gonzalez&Sahni JACM 78] → solution to the scheduling problem → rounding → solution to the UCO, no more than OPT+m-2
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators' votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W such that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer, Lang, &Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
[Figure: three pictures A, B, and C to be ranked by crowdworkers]
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– a probability p with 0.5 < p < 1
• Suppose
– each agent's preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
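A quick Monte Carlo illustration of the theorem (illustrative only; p=0.6 and the seed are arbitrary choices): the estimated probability that the majority outcome matches the ground truth climbs toward 1 as n grows.

import random

def majority_accuracy(p, n, trials=2000):
    # Estimate the probability that a simple majority of n i.i.d. votes,
    # each correct with probability p, matches the ground truth.
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p for _ in range(n))
        hits += correct_votes > n / 2
    return hits / trials

random.seed(0)
for n in (1, 11, 101, 1001):
    print(n, round(majority_accuracy(0.6, n), 3))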

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear about how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.: if c≻d in W, then w/p p, c≻d in V; w/p 1-p, d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
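A brute-force sketch of the MLE under the Mallows model, which, as noted above, coincides with the Kemeny ranking; it enumerates all m! rankings, so it is only usable for a handful of alternatives.

from itertools import combinations, permutations

def kendall_tau(v, w):
    # Number of pairs of alternatives on which rankings v and w disagree.
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum((pos_v[a] - pos_v[b]) * (pos_w[a] - pos_w[b]) < 0
               for a, b in combinations(v, 2))

def mallows_mle(profile):
    # argmax_W phi^(sum of Kendall tau distances to W) = argmin_W of the
    # total distance, i.e. the Kemeny ranking (ties broken by enumeration order).
    alternatives = profile[0]
    return min(permutations(alternatives),
               key=lambda w: sum(kendall_tau(v, w) for v in profile))

profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
print(mallows_mle(profile))  # ('a', 'b', 'c')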

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given Mr:
Step 1 (statistical inference): from the data D = (P1, P2, …, Pn), extract information about the ground truth Θ
Step 2 (decision making): turn that information into a decision (winner, ranking, etc)
81

Example: Kemeny
Decision space: a unique winner
Mr = Mallows model
Step 1 (MLE): the most probable ranking
Step 2: the top-1 alternative of that ranking is the winner
Data D = (P1, P2, …, Pn)
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)

• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!

• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p|Data) assuming a uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
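The two numbers on this slide can be reproduced in a few lines: with a uniform (Beta(1,1)) prior, the posterior after 10 heads and 4 tails is Beta(11,5), and the Bayesian probability of two more heads is E[p^2] = 11*12/(16*17) ≈ 0.485.

def mle_prediction(heads, tails):
    # Plug-in (likelihood) answer: estimate p by its MLE and square it.
    p = heads / (heads + tails)
    return p * p

def bayes_prediction(heads, tails):
    # Bayesian answer with a uniform Beta(1,1) prior:
    # E[p^2] under the Beta(heads+1, tails+1) posterior.
    a, b = heads + 1, tails + 1
    return a * (a + 1) / ((a + b) * (a + b + 1))

print(round(mle_prediction(10, 4), 3))    # 0.51
print(round(bayes_prediction(10, 4), 3))  # 0.485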

Kemeny = Likelihood approach
Winner
Mr = Mallows model
Step 1 (MLE): the most probable ranking
Step 2: top-1 alternative; this is the Kemeny rule (for single winner)!
Data D = (P1, P2, …, Pn)
84

Kemeny = Likelihood approach (2)
Winner
Mr = Condorcet model
Step 1: compute the likelihood for all parameters (opinions)
Step 2: choose the top alternative of the most probable ranking
Data D = (P1, P2, …, Pn)
85

Example: Bayesian [Young APSR-88]
Winner
Mr = Condorcet's model
Step 1: Bayesian update, giving a posterior over rankings
Step 2: the most likely top-1 alternative
Data D = (P1, P2, …, Pn)
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
                     | Anonymity, neutrality, monotonicity | Consistency | Condorcet | Easy to compute
Likelihood (Mallows) | Y                                   | N           | Y         | N
Bayesian (Condorcet) | Y                                   | Y           | N         | N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
[Framework: Data D, inference, information about the ground truth, decision making, decision]
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
[Figure: utility distributions with parameters θ1, θ2, θ3 and sampled utilities U1, U2, U3]

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
[Figure: the parameters θ1, θ2, θ3 generate Agent 1's vote P1 = c2≻c1≻c3, …, Agent n's vote Pn = c1≻c2≻c3]
96
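A small sketch of this generative story, assuming normal utility distributions with unit variance (the means and the seed below are hypothetical): each agent draws a perceived utility for every alternative and ranks by utility.

import random

def sample_profile(means, n, sd=1.0, seed=0):
    # Each agent draws a perceived utility N(means[a], sd^2) for every
    # alternative a and ranks the alternatives by perceived utility.
    rng = random.Random(seed)
    alternatives = list(means)
    profile = []
    for _ in range(n):
        u = {a: rng.gauss(means[a], sd) for a in alternatives}
        profile.append(tuple(sorted(alternatives, key=u.get, reverse=True)))
    return profile

means = {"a": 1.0, "b": 0.5, "c": 0.0}  # hypothetical ground-truth parameters
for ranking in sample_profile(means, n=5):
    print(ranking)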

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1 ≻ c2 ≻ … ≻ cm | λ1, …, λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λ(m-1)/(λ(m-1)+λm)
– the i-th factor is the probability that ci is the top choice among {ci,…,cm}

• Pros:

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
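The product formula translates directly into code; the weights below are hypothetical.

def pl_probability(ranking, lam):
    # Plackett-Luce: at each stage the next alternative in the ranking is
    # chosen with probability proportional to its weight among those left.
    remaining = list(ranking)
    prob = 1.0
    while len(remaining) > 1:
        prob *= lam[remaining[0]] / sum(lam[a] for a in remaining)
        remaining = remaining[1:]
    return prob

lam = {"a": 3.0, "b": 2.0, "c": 1.0}  # hypothetical weights
print(pl_probability(("a", "b", "c"), lam))  # (3/6) * (2/3) = 1/3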

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1 ≻ … ≻ cm | Θ) = ∫_{-∞}^{∞} ∫_{Um}^{∞} … ∫_{U2}^{∞} μm(Um) μ(m-1)(U(m-1)) … μ1(U1) dU1 … dU(m-1) dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
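Since no analytical solution is known, a plain Monte Carlo estimate of a ranking probability is one simple (if slow) fallback; this sketch assumes unit-variance normals and hypothetical means.

import random

def mc_ranking_probability(order, means, sd=1.0, samples=100000, seed=0):
    # Draw utilities from the normal RUM and count how often they respect
    # the given ranking; the frequency estimates Pr(order | Theta).
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = {a: rng.gauss(means[a], sd) for a in means}
        if all(u[order[i]] > u[order[i + 1]] for i in range(len(order) - 1)):
            hits += 1
    return hits / samples

means = {"a": 1.0, "b": 0.5, "c": 0.0}  # hypothetical parameters
print(mc_ranking_probability(("a", "b", "c"), means))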

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Compute the expected log likelihood (ELL)
ELL(Θ| Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
• M-step
– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)
• Until |Pr(D|Θt)-Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):  LL 44.8 (15.8) | Pred. LL 87.4 (30.5) | AIC -79.6 (31.6) | BIC -50.5 (31.6)

Red: statistically significant with 95% confidence
102
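For reference, the two information criteria in their standard forms, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L (lower is better); the log-likelihood, parameter count, and sample size below are hypothetical numbers for illustration, not the values from the election dataset.

import math

def aic(log_likelihood, k):
    # Akaike information criterion: 2k - 2 * log-likelihood (lower is better).
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    # Bayesian information criterion: k * ln(n) - 2 * log-likelihood.
    return k * math.log(n) - 2 * log_likelihood

# hypothetical fitted log-likelihood, parameter count and sample size
print(aic(log_likelihood=-420.0, k=9))
print(bic(log_likelihood=-420.0, k=9, n=50))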

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 93

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52

An approximation algorithm for
positional scoring rules[Xia,Conitzer,& Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1
obtain in the non-manipulators’ profile
=

c

V1

PNM ∪{V1=[c>c1>c2>c3]}
p

c


c1 (J1) p

p1 –p-(s
p1 p11-s-p2)

s1=s
s1-s
1-s
22

c1


c2 (J2) p

p2 –p-(s
p21p-s2 4-p
)

s2=s
s1-s
1-s
33

c3


c3 (J3) p

p3 –p-(s
p3 p1-s
3 -p
3)

s3s=s
1-s14-s4

c2
55

The approximation algorithm
Scheduling
problem

Original UCO

No more than
OPT+m-2

[Gonzalez&Sahni
JACM 78]

Solution to the
UCO

Solution to the
scheduling problem
Rounding
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machines speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators has complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcast a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia Axriv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}.
– 0.5
• Suppose
– each agent’s preferences is generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth

• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

“Ground truth” Θ
P1

P2



Pn

– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.
p
c≻d in V

c≻d in W
1-p

d≻c in V

77

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ Kendall(V,W)

• MLE ranking is the Kemeny rule
78

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Decision
(winner, ranking, etc)
Given Mr

Mr

Step 2: decision making

Information about the
ground truth

ground
truth Θ

Step 1: statistical inference
P1

……

Pn

P1

P2



Pn

Data D
81

Example: Kemeny
Winner

Decision space:
A unique winner

Step 2: top-1 alternative

The most probable ranking

Mr = Mallows model
Step 1: MLE

Step 1: MLE
Step 2: top-alternative

P1

P2



Pn

Data D
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails

Credit: Panos Ipeirotis
& Roy Radner

– Do you think the next two tosses will be two heads in a row?

• Likelihood reasoning

• Bayesian

– there is an unknown
but fixed ground truth

– the ground truth is
captured by a belief
distribution

– p = 10/14=0.714

– Compute Pr(p|Data)
assuming uniform prior

– Pr(2heads|p=0.714)
=(0.714)2=0.51>0.5
– Yes!

– Compute
Pr(2heads|Data)=0.485
<0.5
– No!

83

Kemeny = Likelihood approach
Winner

Step 2: top-1 alternative

Mr = Mallows model
The most probable ranking

This is the Kemeny rule
(for single winner)!

Step 1: MLE
P1

P2



Pn

Data D
84

Kemeny = Likelihood approach (2)
Winner

Mr = Condorcet model
Step 1: compute the
likelihood for all parameters
(opinions)

Step 2: choose the topalternative of the most
probable ranking

Step 2: top-1 alternative

The most probable ranking

Step 1: compute the likelihood
P1

P2



Pn

Data D
85

Example: Bayesian [Young APSR-88]
Winner

Step 2: mostly likely top-1

Mr = Condorcet’ model
Posterior over rankings

Step 1: Bayesian update
P1

P2



Pn

Data D
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
Anonymity,
neutrality,
monotonicity

Consistency

Likelihood
(Mallows)
Bayesian
(Condorcet)

Y

Condorcet

Easy to
compute

Y

N

N

Y

N

Decision space: single winners
Assume uniform prior in the Bayesian approach
Principle: Statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠ϕ,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs

• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that outputs winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75] 89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
Decision

Model selection
– How can we evaluate fitness?

• Likelihood or Bayesian?
– Focus on MLE

decision making
Information about the
ground truth

inference
Data D

• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3|θ1, θ2, θ3) = PrUi ∼ μi (U2>U1>U3)
θ3

U3

θ2

θ1

U1 U2

95

Generating a preference-profile
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
Parameters

θ3

Agent 1
P1= c2≻c1≻c3

θ2



θ1

Agent n
Pn= c1≻c2≻c3
96

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm
Pr(c1

c2

• Pros:

cm | l1

lm ) =

l1 +

l1

+ lm

´

l2 +

l2

+ lm

´

lm-1
´
lm-1 + lm

c21 is the
cm-1top
is preferred
choice in to
{ cc21,…,c
,…,c
m
m}

– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1

cm | Q) =

Um: from -∞ to ∞

ò ò
¥

¥



Um

ò

¥
U2

mm (Um )mm-1 (Um-1 ) m1 (U1 )dU1 dUm-1 dUm

Um-1: from Um to ∞ …

U1: from U2 to ∞

98

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μl’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.

• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL)

ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step

Approximately computed
by Gibbs sampling

– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)

• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal)
- Value(PL)

LL

Pred. LL

AIC

BIC

44.8(15.8)

87.4(30.5)

-79.6(31.6)

-50.5(31.6)

Red: statistically significant with 95% confidence
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects

3. Statistical approaches

• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking +
game-theoretic analysis

• Framework based on
statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Computational thinking + optimization algorithms

CS
Social
Choice

Thank you!
Strategic thinking + methods/principles of aggregation


Slide 94

Computational Social Choice

Lirong Xia

WINE-13 Tutorial
Dec 11, 2013

2011 UK Referendum
• The second nationwide
referendum in UK
– 1st was in 1975

• Member of Parliament election:
Plurality rule  Alternative vote rule?

• 68% No vs. 32% Yes

1

Ordinal Preference Aggregation: Social Choice
A profile

Alice

Bob

Carol

A

>

B

>

C

A

>

B

>

C

B

>

C

>

social choice
mechanism

A

A

2

Ranking pictures [PGM+ AAAI-12]
...
.. .
.
A

A > B > C
Turker 1

.
.
..
.

>

.. . ..
.
. . ..
.
B

B > A
Turker 2

>

. ..
.
C



B > C
Turker n
3

Social choice
Profile
R1

R1*

R2

R2*

social choice mechanism

Outcome




Rn

Rn*

Ri, Ri*: full rankings over a set A of alternatives

4

Social
andChoice
Computer Science
Computational thinking + optimization algorithms

CS
Social
Choice

21th Century

Strategic thinking + methods/principles of aggregation
PLATO
LULL
PLATO et13
al.thC.
4thC. B.C.
4thC. B.C.---20thC.

BORDA
18thC.

CONDORCET
ARROW
TURING et al.
18thC.20thC.
20thC.

5

Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision

6

Applications: academic world
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al.
AAAI-13]

• etc.
7

A burgeoning area
• Recently has been drawing a lot of attention
– IJCAI-11:
– AAAI-11:
– AAMAS-11:

15 papers, best paper
6 papers, best paper
10 full papers, best paper runner-up

– AAMAS-12
– EC-12:

9 full papers, best student paper
3 papers

• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)

• Book in progress: Handbook of Computational Social Choice
8

How to design a good social
What
is
being
“good”?
choice mechanism?

9

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
10

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

NPHard

30 min
30 min

2.2 Computational aspects
Part 2
NPHard

5 min
75 min

3. Statistical approaches
NPHard

11

Common voting rules
(what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule)
is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m : number of alternatives (candidates)
– n : number of agents (voters)
– D=(P1,…,Pn) a profile

• Positional scoring rules
• A score vector s1,...,sm

– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]

An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes,

c1  c2  c3
2

1

0

c2  c1  c3
2

1

0

• c1 gets 2+1+1=4, c2 gets 1+2+0=3,
c3 gets 0+0+2=2
• The winner is c1

c3  c1  c2
2

1

0

Plurality with runoff
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins

• [used in Iran, France, North Carolina State]
a > b > c > d dd >> aa > b > c c > d > a >b
10

7

6

d

b > c > dd >a
>a
3
14

Single transferable vote (STV)
• Also called instant run-off voting or
alternative vote

• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner

• [used in Australia and Ireland]
a > b > cc >> dd dd >> aa >> b > c c > d > aa >b
10

7

6

a

b > c > d >aa
3
15

The Kemeny rule
• Kendall tau distance
– K(V,W)= # {different pairwise comparisons}

K( b ≻ c ≻ a , a ≻ b ≻ c ) = 12
• Kemeny(D)=argminW K(D,W)=argminW ΣP∈DK(P,W)
• For single winner, choose the top-ranked
alternative in Kemeny(D)
• [Has a statistical interpretation]
16

…and many others
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…

17

• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach

18

Axiomatic approach
(what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked
alternative is always the winner
– Fairness for the voters

• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠ϕ, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner,
then it must win
– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is
hard
19

Which axiom is more important?
Condorcet
consistency

Consistency

Easy to compute

Positional
scoring rules

N

Y

Y

Kemeny

Y

N

N

Ranked pairs

Y

N

Y

• Some of these axiomatic properties are not
compatible with others
• Food for thought: how to evaluate partial
satisfaction of axioms?
20

An easy fact
• Theorem. For voting rules that selects a single
winner, anonymity is not compatible with
neutrality
– proof:
Alice

>

>

Bob

>

>

W.O.L.G.


Anonymity

Neutrality
21

Another easy fact
[Fishburn APSR-74]

• Thm. No positional scoring rule is
Condorcet consistent:
– suppose s1 > s2 > s3

>

>

is the Condorcet winner

2 Voters

>

>

: 3s1 + 2s2 + 2s3

1 Voter

>

>

: 3s1 + 3s2 + 1s3

1 Voter

>

>

<

3 Voters

22

Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!

• Gibbard-Satterthwaite theorem
– Next section

• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A2  if it is rule X
– If you believe in A1 A2 A3 are the most desirable properties then X is
optimal
– (anonymity+neutrality+consistency+continuity)  positional scoring
rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency)  Kemeny
[Young&Levenglick SIAMAM-78]
23

Food for thought
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]

24

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
25

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

26

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

27

Which rule is easy to compute?
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ2p-complete
[Hemaspaandra et al. TCS-05]

– Young: Θ2p-complete [Rothe et al. TCS-03]
– Dodgson: Θ2p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hurdy EJOR-10]

• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and W. Schudy STOC-07]

– Fixed-parameter analysis [Betzler et al. TCS-09]
28

Really easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nmlog m

• What if m is extremely large?

29

Combinatorial domains
(Multi-issue domains)
• The set of alternatives can be uniquely
characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue
can take, then A=D1×... ×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={

} ×{

}
30

Multiple referenda
• In California, voters voted on 11 binary issues (
/
)
– 211=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30 Increase sales
and some income tax
for education
• Prop.38 Increase
income tax on almost
everyone for education

31

Other combinatorial domains
• Belief merging [Gabbay et al. JLC-09]
K1

merging operator

K2



Kn

• Judgment aggregation [List and Pettit EP-02]
Action P

Action Q

Liable? (P∧Q)

Judge 1

Y

Y

Y

Judge 2

Y

N

N

Judge 3

N

Y

N

Majority

Y

Y

N
36

Computational axioms
• Easy to compute:
– the winner can be computed in polynomial time

• Hard to manipulate:
– computing a beneficial false vote is hard

37

Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never
a (beneficial) manipulation under this rule
• How important strategy-proofness is as an
desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule
(ties are broken in favor of
>

>

Alice

>

>

Bob

>

>

Carol

>

>

)

Plurality rule

Any strategy-proof voting rule?
• No reasonable voting rule is strategyproof
• Gibbard-Satterthwaite Theorem [Gibbard
Econometrica-73, Satterthwaite JET-75]: When there are
at least three alternatives, no voting rules except
dictatorships satisfy
– non-imposition: every alternative wins for some
profile
– unrestricted domain: voters can use any linear
order as their votes
– strategy-proofness

• Axiomatic characterization for dictatorships!

A few ways out
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submit 0 or 1 for each
alternative

41

Computational thinking
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic
for more than 500 years [Walsh&Xia AAMAS-12]

• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
42

Overview
Manipulation is inevitable
(Gibbard-Satterthwaite Theorem)
Can we use computational complexity as a barrier?
Why prevent manipulation?

Yes
Is it a strong barrier?

No

May lead to very
undesirable outcomes
How often?

Other barriers?
Limited information
Limited communication

Seems not very often
43

Manipulation: A computational
complexity perspective
If it is computationally too hard for a
manipulator to compute a manipulation,
she is best off voting truthfully
– Similar as in cryptography
NPHard

For which common voting rules
manipulation is computationally hard?

44

Computing a manipulation
• Initiated by [Bartholdi, Tovey, &Trick

SCW-89b]

• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang
JACM-07]

– Unweighted manipulation: easy for most common rules

– Weighted manipulation: depends on the number of
manipulators

• Unbounded number of alternatives (next few
slides)
• Assuming the manipulators have complete
information!

45

Unweighted coalitional manipulation
(UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a
profile PM (of the manipulators) such that c is
the winner of PNM∪PM under r
46

The stunningly big table for UCM
#manipulators
Copeland
STV

One manipulator
P [BTT SCW-89b]
NPC [BO SCW-91]

At least two
NPC [FHS AAMAS-08,10]
NPC [BO SCW-91]

Veto

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Plurality with runoff

P [ZPR AIJ-09]

P [ZPR AIJ-09]

Cup

P [CSL JACM-07]

P [CSL JACM-07]

Borda

P [BTT SCW-89b]

NPC

Maximin

P [BTT SCW-89b]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

NPC [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

P [XZP+ IJCAI-09]

Ranked pairs
Bucklin

[DKN+ AAAI-11]
[BNW IJCAI-11]

Nanson’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]

Baldwin’s rule

NPC [NWX AAA-11]

NPC [NWX AAA-11]
47

What can we conclude?
• For some common voting rules,
computational complexity provides some
protection against manipulation

• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept

48

Probably NOT a strong barrier
1. Frequency of
manipulability

2. Easiness of
Approximation

3. Quantitative G-S

AAMAS-14 workshop
Computational Social Choice: Beyond the Worst Case
49

A first angle:
frequency of manipulability
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)

• How often can the manipulators make c
win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman
RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and
Rosenschein AAMAS-07]
50

A general result

[Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules

All-powerful
# manipulators

Θ(√n)

No power
• Computational complexity is not a strong barrier against
manipulation
– UCM as a decision problem is easy to compute in most
cases
– The case of Θ(√n) has been studied experimentally in
[Walsh IJCAI-09]
51

A second angle: approximation
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]

52
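To illustrate the kind of greedy involved, here is a minimal sketch in the spirit of the greedy analyzed by Zuckerman, Procaccia, & Rosenschein: each added manipulator ranks c first and then hands the largest remaining Borda scores to the currently weakest alternatives. The function names and the assumption that ties go to c are mine, not the paper's.

```python
def borda_scores(profile, alternatives):
    """Total Borda scores: the alternative in position i (0-indexed) of a vote
    gets m-1-i points."""
    m = len(alternatives)
    scores = {a: 0 for a in alternatives}
    for ranking in profile:
        for pos, a in enumerate(ranking):
            scores[a] += m - 1 - pos
    return scores

def greedy_uco_borda(profile_nm, c, alternatives):
    """Add manipulators one at a time; each ranks c first and then the other
    alternatives from currently weakest to currently strongest, so the weakest
    ones absorb the large remaining scores.  Stops when c wins (ties assumed
    to go to c)."""
    scores = borda_scores(profile_nm, alternatives)
    others = [a for a in alternatives if a != c]
    manipulator_votes = []
    while any(scores[a] > scores[c] for a in others):
        vote = [c] + sorted(others, key=lambda a: scores[a])
        manipulator_votes.append(vote)
        for pos, a in enumerate(vote):
            scores[a] += len(alternatives) - 1 - pos
    return manipulator_votes

profile_nm = [['a', 'b', 'c', 'd'], ['a', 'c', 'b', 'd'], ['b', 'a', 'd', 'c']]
print(len(greedy_uco_borda(profile_nm, 'd', ['a', 'b', 'c', 'd'])))  # 3 manipulators suffice here
```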

An approximation algorithm for
positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
• A polynomial-time approximation algorithm
that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for
positional scoring rules and a class of scheduling
problems

• Computational complexity is not a strong
barrier against manipulation
– The cost of successful manipulation can be
easily approximated (for positional scoring rules)
53

The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done
in unit time)

• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted
(and resume later maybe on another machine)

• We are asked to compute the minimum
makespan
– the minimum time to complete all jobs

54

Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1
obtain in the non-manipulators’ profile
• Alternative ci corresponds to job Ji with amount of work pi – p
(the deficit ci holds over c)
• Position j ≥ 2 of a manipulator’s vote corresponds to a machine
with speed s1 – sj
• Example: after adding V1 = [c > c1 > c2 > c3] to PNM,
the deficit of c1 drops from p1 – p to p1 – p – (s1 – s2),
that of c2 to p2 – p – (s1 – s3), and that of c3 to p3 – p – (s1 – s4)
55
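A minimal sketch of this correspondence, under the assumptions that the score vector is non-increasing and that tie-breaking is ignored: each other alternative becomes a job whose work is its deficit over c, and each position a manipulator can assign becomes a machine whose speed is the score gap s1 - sj.

```python
def uco_to_scheduling(score_vector, p_c, p_others):
    """Map a UCO instance for a positional scoring rule with score vector
    (s1,...,sm) to a Q|pmtn|Cmax instance: job i has work p_others[i] - p_c
    (c_i's deficit over c after the non-manipulators vote), and the machine
    for position j (j >= 2) of a manipulator vote has speed s1 - s_j."""
    s1 = score_vector[0]
    jobs = [max(0, p_i - p_c) for p_i in p_others]   # amounts of work
    speeds = [s1 - sj for sj in score_vector[1:]]    # machine speeds
    return jobs, speeds

# Borda with 4 alternatives, score vector (3,2,1,0)
jobs, speeds = uco_to_scheduling((3, 2, 1, 0), p_c=1, p_others=[8, 6, 3])
print(jobs, speeds)  # [7, 5, 2] [1, 2, 3]
```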

The approximation algorithm
• Translate the original UCO instance into a scheduling problem
• Solve the scheduling problem exactly [Gonzalez & Sahni JACM-78]
• Round the scheduling solution back into a solution to the UCO
• The rounded solution uses no more than OPT + m – 2 manipulators
56

Complexity of UCM for Borda
• Manipulation of positional scoring rules =
scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling
where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]

– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
57

A third angle: quantitative G-S
• G-S theorem: for any reasonable voting rule
there exists a manipulation
• Quantitative G-S: for any voting rule that is
“far away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut,
Kalai, &Nisan FOCS-08]

– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer
EC-08b, Isaksson,Kindler,&Mossel FOCS-10]

– Finally proved: [Mossel&Racz STOC-12]
58

Next steps
• The first attempt seems to fail

• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information
about the non-manipulators’ votes
– The manipulators can perfectly discuss their
strategies
59

Limited information
• Limiting the manipulator’s information can
make dominating manipulation computationally
harder, or even impossible [Conitzer,Walsh,&Xia
AAAI-11]

• Bayesian information [Lu et al. UAI-12]

60

Limited communication among manipulators
• The leader-follower model
– The leader broadcasts a vote W, and the potential
followers decide whether to cast W or not
• The leader and followers have the same preferences

– Safe manipulation [Slinko&White COMSOC-08]: a vote
W that
• No matter how many followers there are, the
leader/potential followers are not worse off

• Sometimes they are better off

– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
61

Other types of strategic behavior
(of the chairperson)
• Procedure control by
– {adding, deleting} × {voters, alternatives}

– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]

• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a
survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of
these for generalized scoring rules

69

Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization
[Hemaspaandra, Hemaspaandra, & Menton STACS-13]

• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?

• Practical ways to protect elections
70

Outline
45 min

1. Classical Social Choice

5 min
55 min

2.1 Computational aspects
Part 1

15 min
30 min

2.2 Computational aspects
Part 2

5 min
75 min

3. Statistical approaches
71

Ranking pictures [PGM+ AAAI-12]
(pictures A, B, and C shown as images)
Turker 1: A > B > C
Turker 2: B > A
…
Turker n: B > C
72

Two goals for social choice mechanisms
GOAL1: democracy

GOAL2: truth

1. Classical Social Choice
3. Statistical approaches
2. Computational aspects
73

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
74

The Condorcet Jury theorem
[Condorcet 1785]
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– a probability 0.5 < p < 1
• Suppose
– each agent’s preference is generated i.i.d., such that
– w/p p, it is the same as the ground truth
– w/p 1-p, it is different from the ground truth
• Then, as n→∞, the majority of agents’ preferences
converges in probability to the ground truth
75
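A small simulation sketch (not part of the slides) that illustrates the statement: estimate how often the majority recovers the ground truth as n grows, for a fixed p > 0.5.

```python
import random

def majority_correct_rate(n_agents, p, trials=2000, seed=0):
    """Fraction of trials in which the majority of n_agents i.i.d. agents,
    each correct w/p p, matches the ground truth (two alternatives)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        correct_votes = sum(rng.random() < p for _ in range(n_agents))
        hits += correct_votes > n_agents / 2
    return hits / trials

for n in (1, 11, 101, 1001):
    print(n, round(majority_correct_rate(n, p=0.55), 3))
# the estimate climbs toward 1 as n grows, as the theorem predicts
```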

Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth”
parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)

(diagram: the “ground truth” Θ generates votes P1, P2, …, Pn i.i.d.)
– Each P is a ranking

• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism
MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly

• What if Decision space ≠ Parameter space?

76

Condorcet’s model
• Condorcet was not very clear how the Condorcet Jury theorem
can be extended to m>2
• Young had an interpretation [Young APSR-1988]

• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between
candidates (can be cyclic)
– p<1

• Sample space
– all combinations of opinions

• Given “ground truth” opinions W and p<1, generate opinions V s.t.
each opinion is i.i.d.:
– if c≻d in W, then w/p p the opinion is c≻d in V,
and w/p 1-p it is d≻c in V

77
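A minimal sketch of the resulting likelihood computation, assuming opinions are encoded as sets of ordered pairs (a, b) meaning a≻b; the encoding and function name are illustrative.

```python
def condorcet_likelihood(profile, W, p):
    """Likelihood of a profile of pairwise-comparison opinions given ground
    truth opinions W under Condorcet's model: each opinion independently
    agrees with W w/p p and is flipped w/p 1-p."""
    likelihood = 1.0
    for opinions in profile:
        for (a, b) in opinions:
            likelihood *= p if (a, b) in W else (1 - p)
    return likelihood

W = {('a', 'b'), ('b', 'c'), ('a', 'c')}    # ground truth a ≻ b ≻ c
V = {('a', 'b'), ('c', 'b'), ('a', 'c')}    # one agent flips b vs. c
print(condorcet_likelihood([V], W, p=0.8))  # 0.8 * 0.2 * 0.8 = 0.128
```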

Mallows model [Mallows 1957]
• Parameter space
– all rankings over candidates
– ϕ<1

• Sample space
– all rankings over candidates

• Given a “ground truth” ranking W and ϕ<1,
generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)

• MLE ranking is the Kemeny rule
78
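Since Pr(V|W) ∝ ϕ^Kendall(V,W) with ϕ<1, maximizing the likelihood of W over a profile is the same as minimizing the total Kendall-tau distance to the profile, which is exactly the Kemeny ranking. A brute-force sketch for small m (illustrative names; the search is exponential in m, consistent with Kemeny being hard in general):

```python
from itertools import permutations

def kendall_tau(r1, r2):
    """Number of pairwise disagreements between two rankings."""
    pos = {a: i for i, a in enumerate(r2)}
    return sum(1 for i in range(len(r1)) for j in range(i + 1, len(r1))
               if pos[r1[i]] > pos[r1[j]])

def mallows_mle_ranking(profile, alternatives):
    """Brute-force MLE of the Mallows ground truth = Kemeny ranking:
    minimize the total Kendall-tau distance to the profile."""
    return min(permutations(alternatives),
               key=lambda W: sum(kendall_tau(V, W) for V in profile))

profile = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c')]
print(mallows_mle_ranking(profile, ['a', 'b', 'c']))  # ('a', 'b', 'c')
```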

Recent studies on Condorcet/Mallows model
• Learning [Lu and Boutilier ICML-11]

• Approximation by common voting rules
[Caragiannis, Procaccia & Shah EC-13]

79

Outline: statistical approaches
Condorcet/Mallows model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

80

Statistical decision framework
[Azari, Parkes, and Xia WINE poster]
Given a model Mr with ground truth Θ and data D = (P1, P2, …, Pn):
• Step 1 (statistical inference): from the data D, extract information about the ground truth
• Step 2 (decision making): turn that information into a decision (winner, ranking, etc.)
81

Example: Kemeny
• Mr = Mallows model; decision space: a unique winner
• Step 1 (MLE): compute the most probable ranking from the data D = (P1, P2, …, Pn)
• Step 2 (top-alternative): output the top-1 alternative of that ranking as the winner
82

Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
(Credit: Panos Ipeirotis & Roy Radner)
• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– compute Pr(p | Data) assuming a uniform prior
– compute Pr(2 heads | Data) = 0.485 < 0.5
– No!

83
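The two numbers on this slide can be reproduced with a short calculation: under a uniform (Beta(1,1)) prior, the posterior after 10 heads and 4 tails is Beta(11,5), and Pr(2 heads | Data) = E[p^2] = 11·12/(16·17) ≈ 0.485. A sketch:

```python
def predictive_two_heads(heads, tails):
    """Pr(next two tosses are heads | data) with a uniform prior on p:
    the posterior is Beta(heads+1, tails+1) and E[p^2] under Beta(a, b)
    equals a*(a+1) / ((a+b)*(a+b+1))."""
    a, b = heads + 1, tails + 1
    return a * (a + 1) / ((a + b) * (a + b + 1))

print(round((10 / 14) ** 2, 3))               # likelihood reasoning: 0.51
print(round(predictive_two_heads(10, 4), 3))  # Bayesian: 0.485
```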

Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1 (MLE): compute the most probable ranking from the data D = (P1, P2, …, Pn)
• Step 2: output the top-1 alternative as the winner
• This is the Kemeny rule (for single winner)!
84

Kemeny = Likelihood approach (2)
• Mr = Condorcet’s model
• Step 1: compute the likelihood of all parameters (opinions) given the data D
• Step 2: choose the top-alternative of the most probable ranking as the winner
85

Example: Bayesian [Young APSR-88]
• Mr = Condorcet’s model
• Step 1 (Bayesian update): compute the posterior over rankings given the data D
• Step 2: output the most likely top-1 alternative
86

Likelihood vs. Bayesian
[Azari, Parkes, and Xia WINE poster]
(Table: the likelihood approach (Mallows, i.e. Kemeny) and the Bayesian approach
(Condorcet’s model) are compared on anonymity/neutrality/monotonicity,
consistency, the Condorcet criterion, and ease of computation;
each approach satisfies some of these criteria and fails others.)
• Decision space: single winners
• Assume uniform prior in the Bayesian approach
• Principle: statistical decision theory

87

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

88

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠∅,
then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules
are NOT MLEs
• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and
consistency
– Positional scoring rules are the only voting rules that satisfy
anonymity, neutrality, and consistency! [Young SIAMAM-75]
89

Classical voting rules as MLEs
[Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional
scoring rules and Kemeny are NOT MLEs

• This is not (completely) a coincidence!
– Kemeny is the only preference function (that
outputs rankings) that satisfies neutrality,
reinforcement, and Condorcet consistency
[Young&Levenglick SIAMAM-78]
90

Are we happy?
• Condorcet’s model
– not very natural
– computationally hard

• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
91

New mechanisms via the statistical
decision framework
(framework: Data D → inference → information about the ground truth → decision making → decision)
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
92

Why not just a problem of
machine learning or statistics?
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)

93

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

94

Random utility model (RUM)
[Thurstone 27]
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi

• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)

• Agents rank alternatives according to their perceived utilities

– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2 > U1 > U3)
(figure: the three utility distributions μ1, μ2, μ3 with sampled utilities U1, U2, U3)

95

Generating a preference-profile
• Pr(Data | θ1, θ2, θ3) = ∏R∈Data Pr(R | θ1, θ2, θ3)
• Each agent independently draws utilities from the parameterized
distributions and reports the induced ranking, e.g.
– Agent 1: P1 = c2≻c1≻c3
– …
– Agent n: Pn = c1≻c2≻c3
96
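A minimal sketch of this generative process for a Thurstone-style RUM with normal utility distributions; the means and σ below are illustrative.

```python
import random

def sample_profile_rum(means, n_agents, sigma=1.0, seed=0):
    """Sample a preference profile from an RUM with normal utility
    distributions: each agent draws one utility per alternative from
    Normal(means[a], sigma^2) and reports the induced ranking."""
    rng = random.Random(seed)
    alternatives = list(means)
    profile = []
    for _ in range(n_agents):
        u = {a: rng.gauss(means[a], sigma) for a in alternatives}
        profile.append(sorted(alternatives, key=lambda a: -u[a]))
    return profile

for ranking in sample_profile_rum({'c1': 1.0, 'c2': 0.5, 'c3': 0.0}, 5):
    print(ranking)  # rankings tend to favor c1, but vary across agents
```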

RUMs with Gumbel distributions
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]

• Equivalently, there exist positive numbers λ1,…,λm such that
Pr(c1≻c2≻…≻cm | λ1,…,λm)
= [λ1/(λ1+λ2+…+λm)] × [λ2/(λ2+…+λm)] × … × [λm-1/(λm-1+λm)]
– the first factor is the probability that c1 is the top choice in {c1,…,cm},
the second that c2 is the top choice in {c2,…,cm}, …,
the last that cm-1 is preferred to cm
• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable

• Widely applied in Economics [McFadden 74], learning to rank [Liu 11],
and analyzing elections [GM 06,07,08,09]

• Cons: does not seem to fit very well

97
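A short sketch of the P-L ranking probability defined above; the λ values are illustrative.

```python
def plackett_luce_prob(ranking, lam):
    """Pr(ranking | lambda) under Plackett-Luce: the top item is chosen
    w/p lambda_top / (sum of all lambdas), then the next among the
    remaining items, and so on."""
    prob = 1.0
    remaining = sum(lam[a] for a in ranking)
    for a in ranking[:-1]:
        prob *= lam[a] / remaining
        remaining -= lam[a]
    return prob

lam = {'c1': 3.0, 'c2': 2.0, 'c3': 1.0}
print(plackett_luce_prob(['c1', 'c2', 'c3'], lam))  # (3/6) * (2/3) = 1/3
```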

RUM with normal distributions
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]

• Pros:
– Intuitive
– Flexible

• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P |
Θ) is known
Pr(c1≻…≻cm | Θ)
= ∫_{Um=-∞}^{∞} μm(Um) ∫_{Um-1=Um}^{∞} μm-1(Um-1) ⋯ ∫_{U1=U2}^{∞} μ1(U1) dU1 ⋯ dUm-1 dUm
– Um ranges from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

98
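Because the integral above has no known analytical solution, one way to get a handle on Pr(c1≻…≻cm | Θ) is plain Monte Carlo simulation; a minimal sketch with illustrative parameters:

```python
import random

def mc_ranking_prob(ranking, means, sigma=1.0, samples=50000, seed=0):
    """Monte Carlo estimate of Pr(ranking | Theta) for an RUM with normal
    utilities: draw utilities and count how often they induce the ranking."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = {a: rng.gauss(means[a], sigma) for a in means}
        if all(u[ranking[i]] > u[ranking[i + 1]] for i in range(len(ranking) - 1)):
            hits += 1
    return hits / samples

means = {'c1': 1.0, 'c2': 0.5, 'c3': 0.0}
print(mc_ranking_prob(['c1', 'c2', 'c3'], means))  # estimate of Pr(c1 ≻ c2 ≻ c3 | Theta)
```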

Unimodality of likelihood
[APX. NIPS-12]
• Location family: RUMs where each μi is
parameterized by its mean θi
– Normal distributions with fixed variance
– P-L

• Theorem. For any RUM in the location family, if
the PDF of each μi is log-concave, then for any
preference-profile D, the likelihood function
Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

99

MC-EM algorithm for RUMs
[APX NIPS-12]
• Utility distributions μi’s belong to the exponential
family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel,
etc.
• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL),
approximated by Gibbs sampling:
ELL(Θ | Data, Θt) = f(Θ, g(Data, Θt))
• M-step
– Choose Θt+1 = argmaxΘ ELL(Θ | Data, Θt)
• Until |Pr(D|Θt)-Pr(D|Θt+1)| < ε
100

Outline: statistical approaches
Condorcet’s MLE model
(history)

Why MLE?

Why Condorcet’s
model?

A General framework

Random Utility Models

Model selection
101

Model selection
• Compare RUMs with Normal distributions and
PL for
– log-likelihood

– predictive log-likelihood,
– Akaike information criterion (AIC),
– Bayesian information criterion (BIC)

• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):
– LL: 44.8 (15.8)
– Pred. LL: 87.4 (30.5)
– AIC: -79.6 (31.6)
– BIC: -50.5 (31.6)
(in the original slide, entries shown in red are statistically significant with 95% confidence)
102

Recent progress
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features

• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning

• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
103

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

(diagram: CS brings computational thinking + optimization algorithms to Social Choice; Social Choice brings strategic thinking + methods/principles of aggregation to CS)

Thank you!