
On the Limits of Dictatorial
Classification
Reshef Meir
School of Computer Science and
Engineering, Hebrew University
Joint work with
Shaull Almagor, Assaf Michaely and Jeffrey S. Rosenschein
Strategy-Proof Classification
• An example
• Motivation
• Our model and previous results
• Filling the gap: proving a lower bound
• The weighted case
Strategic labeling: an example
[Figure: a dataset labeled by several agents; the ERM classifier on the truthful labels makes 5 errors.]
"There is a better classifier! (for me…)"
"If I just change the labels…"
[Figure: after one agent relabels its points, the ERM on the reported dataset makes 2+5 = 7 errors on the true labels.]
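To see the incentive failure in miniature, here is a sketch with the tiny concept class {all positive, all negative} (our own toy numbers, not the figure's):

```python
# A toy manipulation instance (our own numbers, not the figure's),
# with the tiny concept class C = {'+', '-'} of constant classifiers.

def risk(c, labels):
    """Fraction of labels that the constant classifier c gets wrong."""
    return sum(1 for y in labels if y != c) / len(labels)

def erm(reported):
    """The concept in C with the fewest errors on the reported joint
    dataset; ties are broken in favor of '+'."""
    joint = [y for labels in reported for y in labels]
    return min(['+', '-'], key=lambda c: risk(c, joint))

truth = [['+', '+', '-'],           # agent 1's true labels
         ['-', '-', '-']]           # agent 2's true labels

honest = erm(truth)                 # '-': 2 true errors, both agent 1's
lie = [['+', '+', '+'], truth[1]]   # agent 1 flips its single '-'
bent = erm(lie)                     # '+': the tie now breaks agent 1's way

print(honest, risk(honest, truth[0]))   # - 0.667
print(bent, risk(bent, truth[0]))       # + 0.333  (lying paid off)
# Meanwhile the true global error count rises from 2 to 4.
```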
Classification
The Supervised Classification problem:
– Input: a set of labeled data points {(xi,yi)}i=1..m
– Output: a classifier c from some predefined concept class C (e.g., functions of the form f : X → {−,+})
– We usually want c not just to classify the sample correctly, but to generalize well, i.e., to minimize
R(c) ≡ E(x,y)~D[ c(x) ≠ y ]
the expected error w.r.t. the distribution D (the 0/1 loss)
Classification (cont.)
• A common approach is to return the ERM
(Empirical Risk Minimizer), i.e., the concept in C
that is the best w.r.t. the given samples (has the
lowest number of errors)
• Generalizes well under some assumptions on
the concept class C (e.g., linear classifiers tend
to generalize well)
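As a concrete (illustrative) instance, here is a brute-force ERM for 1-D threshold classifiers, a concept class of our own choosing; the slides do not fix a particular C:

```python
# A minimal ERM sketch for an illustrative concept class: thresholds on
# the line, c_t(x) = '+' iff x >= t.

def erm_threshold(samples):
    """Return the threshold t with the fewest empirical errors."""
    xs = sorted({x for x, _ in samples})
    candidates = [float('-inf')] + xs      # only sample points matter

    def errors(t):
        return sum(1 for x, y in samples
                   if ('+' if x >= t else '-') != y)

    return min(candidates, key=errors)

data = [(0.5, '-'), (1.2, '-'), (2.0, '+'), (3.1, '+')]
print(erm_threshold(data))   # 2.0 -- zero empirical errors
```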
With multiple experts, we can’t trust our ERM!
Where do we find “experts” with incentives?
Example 1: A firm learning purchase patterns
– Information gathered from local retailers
– The resulting policy affects them
– “the best policy is the policy that fits my pattern”
Example 2: Internet polls / polls of experts
[Figure: users submit a reported dataset; a classification algorithm maps it to a classifier.]
Motivation from other domains
• Aggregating partitions
• Judgment aggregation:

Agent   A   B   A&B   A | ~B
1       T   F   F     T
2       F   T   F     F
3       F   F   F     T

• Facility location (on the binary cube)
A problem instance is defined by
• A set of agents I = {1,...,n}
• A set of data points X = {x1,...,xm} ⊆ 𝒳
• For each xk ∈ X, agent i has a label yik ∈ {−,+}
– Each pair sik = ⟨xk, yik⟩ is a sample
– All samples of a single agent compose the labeled dataset Si = {si1,...,si,m(i)}
• The joint dataset S = ⟨S1, S2,…, Sn⟩ is our input
– m = |S|
• We denote the dataset with the reported labels by S′
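In code, the model's objects might look as follows (a sketch; the class and variable names are ours, not the paper's):

```python
from dataclasses import dataclass

# A minimal rendering of the model's objects.

@dataclass(frozen=True)
class Sample:
    x: str   # a data point x_k
    y: str   # the agent's label y_ik, '+' or '-'

S1 = [Sample('x1', '+'), Sample('x2', '-')]   # agent 1's dataset S_1
S2 = [Sample('x1', '-'), Sample('x2', '-')]   # agent 2's dataset S_2
S = [S1, S2]                                  # the joint dataset S
m = sum(len(Si) for Si in S)                  # m = |S| = 4
```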
Input: Example
[Figure: the same m points labeled by three agents; each agent supplies one column of +/− labels.]
X ∈ 𝒳^m
Y1, Y2, Y3 ∈ {−,+}^m
S = ⟨S1, S2,…, Sn⟩ = ⟨(X,Y1),…, (X,Yn)⟩
Mechanisms
• A mechanism M receives a labeled dataset S and outputs c = M(S) ∈ C
• Private risk of i: Ri(c,S) = |{k : c(xik) ≠ yik}| / mi (the % of errors on Si)
• Global risk: R(c,S) = |{⟨i,k⟩ : c(xik) ≠ yik}| / m (the % of errors on S)
• We allow non-deterministic mechanisms
– Measure the expected risk
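Both risk measures are straightforward to compute; a sketch, with samples as (x, y) pairs and a classifier given as a function:

```python
# Sketches of the two risk measures; samples are (x, y) pairs and a
# classifier is any function from points to '+'/'-'.

def private_risk(c, S_i):
    """R_i(c, S): the fraction of agent i's own samples that c mislabels."""
    return sum(1 for x, y in S_i if c(x) != y) / len(S_i)

def global_risk(c, S):
    """R(c, S): the fraction of all m samples, over all agents."""
    errors = sum(1 for S_i in S for x, y in S_i if c(x) != y)
    return errors / sum(len(S_i) for S_i in S)

S = [[('x1', '+'), ('x2', '-')], [('x1', '-'), ('x2', '-')]]
all_neg = lambda x: '-'
print(private_risk(all_neg, S[0]))   # 0.5
print(global_risk(all_neg, S))       # 0.25
```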
ERM
We compare the outcome of M to the ERM:
c* = ERM(S) = argmin_{c ∈ C} R(c,S)
r* = R(c*,S)
Can our mechanism simply compute and return the ERM?
Requirements (the most important slide)
1. Good approximation: ∀S: R(M(S),S) ≤ α·r*
2. Strategy-proofness (SP): ∀i, S, Si′: Ri(M(S−i, Si′), S) ≥ Ri(M(S), S)
(the left side is the risk when lying, the right side when truthful)
• ERM(S) is 1-approximating but not SP
• ERM(S1) is SP but gives a bad approximation
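The first bullet can be checked by brute force on a tiny instance; a sketch, reusing the constant-concept class from the opening example:

```python
from itertools import product

# Brute-force SP check for a deterministic mechanism over the constant
# concept class: no agent may lower its private risk by any misreport.

def risk(c, labels):
    return sum(1 for y in labels if y != c) / len(labels)

def erm(reported):
    joint = [y for labels in reported for y in labels]
    return min(['+', '-'], key=lambda c: risk(c, joint))

def is_strategy_proof(mechanism, true_datasets):
    for i, S_i in enumerate(true_datasets):
        honest = risk(mechanism(true_datasets), S_i)
        for fake in product('+-', repeat=len(S_i)):
            reported = list(true_datasets)
            reported[i] = list(fake)
            if risk(mechanism(reported), S_i) < honest:
                return False          # a profitable lie exists
    return True

print(is_strategy_proof(erm, [['+', '+', '-'], ['-', '-', '-']]))  # False
```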
Related work
• A study of SP mechanisms in Regression learning
– O. Dekel, F. Fischer and A. D. Procaccia, SODA (2008), JCSS (2009).
[supervised learning]
• No SP mechanisms for Clustering
– J. Perote-Peña and J. Perote, Economics Bulletin (2003)
[unsupervised learning]
Previous work: a simple case
• Tiny concept class: |C| = 2
• Either "all positive" or "all negative"
Theorem:
• There is an SP 2-approximation mechanism
• There is no SP α-approximation mechanism for any α < 2
Meir, Procaccia and Rosenschein, AAAI 2008
Previous work
General concept classes
Theorem: Selecting a dictator at random is SP and guarantees a (3 − 2/n)-approximation
– True for any concept class C
– Generalizes well from sampled data when C has bounded VC dimension
Open question #1: are there better mechanisms?
Open question #2: what if agents are weighted?
Meir, Procaccia and Rosenschein, IJCAI 2009
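A sketch of the mechanism; the concept class and risk function are supplied by the caller, since the theorem holds for any C:

```python
import random

# The random-dictator mechanism: pick one agent uniformly and return the
# concept that is best for that agent's own samples. SP, because the
# outcome depends only on the chosen agent's report, and truth-telling
# is optimal for the dictator.

def random_dictator(datasets, concept_class, risk):
    i = random.randrange(len(datasets))
    return min(concept_class, key=lambda c: risk(c, datasets[i]))

risk01 = lambda c, labels: sum(1 for y in labels if y != c) / len(labels)
print(random_dictator([['+', '+', '-'], ['-', '-', '-']], ['+', '-'], risk01))
```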
A lower bound
Our main result:
Theorem: There is a concept class C (where |C| = 3) for which any SP mechanism has an approximation ratio of at least 3 − 2/n
o Matches the upper bound from IJCAI-09
o The proof is by a careful reduction to a voting scenario
o We will see a proof sketch
Proof sketch
Gibbard [‘77] proved that every (randomized) SP voting rule for 3 candidates must be a lottery over dictators*.
We define X = {x, y, z}, and C as follows:

     x   y   z
cx   +   -   -
cy   -   +   -
cz   -   -   +

We also restrict the agents, so that each agent can have mixed labels on just one point:

          x          y          z
Agent 1   --------   ++++----   ++++++++
Agent 2   ++++++++   --------   ++------
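From these two tables, each agent's ranking of the three concepts follows mechanically from its private risk; a sketch (the 8-sample columns follow our reconstruction of the figure):

```python
# The three concepts and the two restricted agents from the tables above.

C = {'cx': {'x': '+', 'y': '-', 'z': '-'},
     'cy': {'x': '-', 'y': '+', 'z': '-'},
     'cz': {'x': '-', 'y': '-', 'z': '+'}}

agents = {1: {'x': '-' * 8, 'y': '+' * 4 + '-' * 4, 'z': '+' * 8},
          2: {'x': '+' * 8, 'y': '-' * 8, 'z': '+' * 2 + '-' * 6}}

def private_risk(c, labels):
    errors = sum(sum(1 for y in ys if y != c[pt])
                 for pt, ys in labels.items())
    return errors / sum(len(ys) for ys in labels.values())

for i, labels in agents.items():
    order = sorted(C, key=lambda name: private_risk(C[name], labels))
    print(i, order)   # 1: ['cz', 'cy', 'cx']   2: ['cx', 'cz', 'cy']
```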
Proof sketch (cont.)
Suppose that M is SP.
[Figure: the same two label profiles as above.]
Proof sketch (cont.)
Suppose that M is SP.
[Figure: the two profiles, annotated with the agents' induced preferences: Agent 1: cz > cy > cx; Agent 2: cx > cz > cy.]
1. M must be monotone on the mixed point
2. M must ignore the mixed point
3. M is a (randomized) voting rule
Proof sketch (cont.)
[Figure: the same annotated profiles: Agent 1 prefers cz > cy > cx; Agent 2 prefers cx > cz > cy.]
4. By Gibbard [‘77], M is a random dictator
5. We construct an instance where random dictators perform poorly
Weighted agents
• We must select a dictator randomly
• However, the selection probability may be based on the agents' weights
• Naïve approach: pr(i) ∝ wi
o Only gives a 3-approximation
• An optimal SP algorithm: pr(i) ∝ wi / (2(1 − wi))
o Matches the lower bound of 3 − 2/n
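A sketch of the optimal selection rule under our reading of the slide's formula, pr(i) ∝ wi / (2(1 − wi)); the exact expression should be checked against the paper:

```python
# Selection probabilities for the weighted dictator lottery, assuming
# pr(i) proportional to w_i / (2 * (1 - w_i)) -- our reading of the
# slide, not a verified formula.

def dictator_distribution(weights):
    """weights: agent weights summing to 1 (each strictly below 1)."""
    scores = [w / (2 * (1 - w)) for w in weights]
    total = sum(scores)
    return [s / total for s in scores]

print(dictator_distribution([0.5, 0.3, 0.2]))
# ~[0.596, 0.255, 0.149]: heavier agents are favored more than linearly
```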
Future work
• Other concept classes
• Other loss functions (linear loss, quadratic loss,…)
• Alternative assumptions on structure of data
• Other models of strategic behavior
• …