Last Class (Before Review)
Last Class!
This is our last class on substantive, new
material (next time is the review).
For the past week and a half, we have been
discussing the basic framework of decision
theory, particularly applied to decisions under
ignorance.
Decision Theory
The goal of decision theory is to devise a rule or
set of rules that tell us what act is appropriate,
given a problem specification and a set of
rational preferences.
There are some obvious principles we should
abide by, even though they don’t help in every
decision– for example, the dominance principle.
Dominance Principle
The dominance principle says that if one act is
better than or equal to every other act in every
state, then you should take that act.
Clearly, however, this doesn’t help with most
decisions, because in most decisions different
acts are better depending on the state.
Example: Dominance Principle

        S1    S2
A1**     5     7
A2       3     7

(** marks the act the rule selects: A1 is at least as good as A2 in every state, so dominance says choose A1.)
Decision Rules: Maximin
Here’s a brief look at the decision rules we’ve
covered:
1. The Maximin Rule: For each act, find the
worst possible outcome that could result from
that act. Choose the act whose worst possible
outcome is the best of all the acts. Maximize the
minimum outcome.
Example: Maximin

        S1    S2    S3
A1**     1*   14    13
A2      -1*   17    11
A3       0*   20     6

(* marks each act's worst outcome; A1's worst outcome, 1, is the best of these, so maximin chooses A1.)
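The maximin calculation on this table can be sketched in Python (the payoffs are taken from the example above; the variable names are my own):

```python
# Payoff table from the slide: each act's outcomes in states S1, S2, S3.
payoffs = {
    "A1": [1, 14, 13],
    "A2": [-1, 17, 11],
    "A3": [0, 20, 6],
}

# Maximin: find each act's worst outcome, then pick the act
# whose worst outcome is best.
worst = {act: min(row) for act, row in payoffs.items()}
best_act = max(worst, key=worst.get)
print(worst)     # {'A1': 1, 'A2': -1, 'A3': 0}
print(best_act)  # A1
```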
Decision Rules: Minimax Regret
2. Minimax Regret: For each act, calculate the
amount of "missed opportunity" in each of the
states. That is, how much does the outcome for
that act in that state fall short of the best
outcome any act achieves in that state? That's
how much you'd regret that act if that state
obtained. Find the maximum amount of regret
for each act, then choose the act with the
smallest maximum regret.
Example: Minimax Regret (payoff table)

        S1    S2    S3
A1       1    14    13
A2**    -1    17    11
A3       0    20     6
Example: Minimax Regret (regret table)

        S1    S2    S3
A1       0     6*    0
A2       2     3*    2
A3       1     0     7*

(* marks each act's maximum regret; A2's maximum regret, 3, is the smallest, so minimax regret chooses A2.)
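The regret table above can be derived mechanically from the payoff table. A quick sketch in Python (payoffs from the example; names are my own):

```python
# Payoff table from the slide (acts x states S1-S3).
payoffs = {
    "A1": [1, 14, 13],
    "A2": [-1, 17, 11],
    "A3": [0, 20, 6],
}
acts = list(payoffs)
n_states = 3

# Regret of an act in a state: the best payoff achievable in that
# state (across all acts) minus the act's own payoff there.
best_in_state = [max(payoffs[a][s] for a in acts) for s in range(n_states)]
regret = {a: [best_in_state[s] - payoffs[a][s] for s in range(n_states)]
          for a in acts}

# Minimax regret: choose the act whose maximum regret is smallest.
max_regret = {a: max(r) for a, r in regret.items()}
choice = min(max_regret, key=max_regret.get)
print(regret)  # {'A1': [0, 6, 0], 'A2': [2, 3, 2], 'A3': [1, 0, 7]}
print(choice)  # A2
```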
Decision Rules: Optimism-Pessimism
Rule
3. Optimism-Pessimism Rule: For each act, find
the best possible outcome, and the worst
possible outcome. Figure out how much you
care about obtaining what’s best and avoiding
what’s worst (figure out your optimism index).
Then choose the act with the best weighted
average of best and worst outcomes.
Example: Optimism-Pessimism Rule

        S1    S2    S3
A1       1    14    13
A2      -1    17    11
A3**     0    20     6
Assume Optimism Index of 50%
Here are the OPNs for the acts given O = 50%.
OPN for A1 = 0.5 x 1 + 0.5 x 14 = 7.5
OPN for A2 = 0.5 x -1 + 0.5 x 17 = 8
OPN for A3** = 0.5 x 0 + 0.5 x 20 = 10
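The OPN calculations above are just a weighted average of each act's best and worst outcomes. A minimal sketch in Python (payoffs from the example; names are my own):

```python
# Payoff table from the slide; O is the optimism index,
# the weight placed on each act's best outcome.
payoffs = {
    "A1": [1, 14, 13],
    "A2": [-1, 17, 11],
    "A3": [0, 20, 6],
}
O = 0.5

# OPN: weighted average of best and worst outcomes; pick the highest.
opn = {a: O * max(row) + (1 - O) * min(row) for a, row in payoffs.items()}
choice = max(opn, key=opn.get)
print(opn)     # {'A1': 7.5, 'A2': 8.0, 'A3': 10.0}
print(choice)  # A3
```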
Ignorance vs. Risk
These rules are all possible rules for how to
make decisions under ignorance (where we
don’t know how probable all of the states are).
When we have a decision under risk, instead,
there is only one rule that decision theorists
take seriously: maximize expected utility.
Utilities vs. Values
As a way of illustrating, I am going to replace
utilities with dollar amounts, and the rule
“maximize expected utility” with “maximize
expected value” (dollar value).
This rule says: take the action with the greatest
expected dollar value.
Expected Values
Suppose I’m going to flip a fair coin twice, and pay
you the following amounts for the following
outcomes:
• HH: $20
• HT: $8
• TH: $4
• TT: $1
How much money do you expect to win? How
much would you pay to play this game?
Expected Values
Here’s what we know: each of the outcomes
(HH, HT, TH, and TT) is equally probable at a 25%
chance of happening. The expected value of this
game is the sum of the probabilities of each
outcome multiplied by the values of those
outcomes:
P(HH)x$20 + P(HT)x$8 + P(TH)x$4 + P(TT)x$1
= $5 + $2 + $1 + $0.25 = $8.25
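The same sum of probability-weighted payouts, in Python (numbers from the coin game above):

```python
# Expected value of the two-flip coin game:
# sum of probability x payout over the four equally likely outcomes.
outcomes = {"HH": 20, "HT": 8, "TH": 4, "TT": 1}
p = 0.25

ev = sum(p * payout for payout in outcomes.values())
print(ev)  # 8.25
```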
Expected Values
According to decision theory (if your utilities are
linear in dollars), you should always pay less than
$8.25 to play this game, be indifferent between
paying exactly $8.25 and playing, and never pay
more than $8.25 to play. This game is worth its
expected value: paying less than the EV to play is
a bargain, and paying more than the EV is like
paying $2 to get $1. It's irrational.
Reasons for Valuing at the EV
Now there are lots of complicated reasons for
believing that acts and games are worth their
expected values (or better: expected utilities).
We can't go into all of these reasons here.
Here's one of them: the law of large numbers
says that if you play this game a large number of
times, your average payout per game will almost
certainly approach $8.25.
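A rough simulation illustrates the law-of-large-numbers point (this is my own sketch, not part of the slides):

```python
import random

# Simulate the two-flip coin game many times; the average payout
# per game should settle near the expected value of $8.25.
payout = {("H", "H"): 20, ("H", "T"): 8, ("T", "H"): 4, ("T", "T"): 1}

random.seed(0)  # fixed seed so the run is reproducible
n = 100_000
total = sum(payout[(random.choice("HT"), random.choice("HT"))]
            for _ in range(n))
print(round(total / n, 2))  # close to 8.25
```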
Decisions Under Risk
How does this involve decision theory?
Well, for decisions under risk, decision theory
says: calculate the expected value (utility) for
each act. Then take the act with the highest
expected value (utility). Maximize expected
value (utility).
Example: Maximizing EV
Suppose that you work for a medical insurance
company. Everyone who applies for insurance
must fill out a complicated medical history
questionnaire. Each policy lasts for 1 year and
has a premium of $1,000. If someone dies
during that year, they receive $250,000.
Example: Maximizing EV
Now suppose that I come into your office and
apply for insurance. I fill out the medical
questionnaire, and your statisticians determine
that I have a 5% chance of dying within the next
year. Should you insure me? (Remember this
question means: should you rationally insure
me; it might be moral to insure me even if it’s
not rational.)
Decision Table

                       Michael Dies (0.05)    Michael Lives (0.95)
Insure Michael             -$250,000               +$1,000
Don't Insure Michael           $0                      $0
Maximize EV
According to the rule “maximize expected
value,” we should calculate the expected value
of each act and take the one with the highest EV.
EV don't insure = $0 x 5% + $0 x 95% = $0
EV do insure
= -$250,000 x 5% + $1,000 x 95% = -$11,550
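The insurance EVs can be checked with a few lines of Python (figures from the example above):

```python
# Expected value of each act in the insurance example.
p_die, p_live = 0.05, 0.95

ev_insure = -250_000 * p_die + 1_000 * p_live  # about -11,550
ev_dont = 0 * p_die + 0 * p_live               # 0

# Maximize EV: take the act with the higher expected value.
best = max([("insure", ev_insure), ("don't insure", ev_dont)],
           key=lambda t: t[1])
print(best[0])  # don't insure
```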
Utility vs. Value
Decision theorists prefer to talk in terms of
utility rather than (monetary) value. 'Utility' is
just a special name for non-monetary value:
how much something is "really worth," not in
dollars, but in terms of personal satisfaction to
you. You might think that this can be measured
by how much you would be willing to pay– and
sometimes it is– but sometimes it isn't.
Utility < EV
Suppose you’ve been saving up to put a down
payment on a flat– your dream flat, the one you
plan on living in for the rest of your life. A down
payment is HKD$150,000. Yesterday, you just
finished saving enough money, and tomorrow,
you plan on purchasing the flat. Then your stock
broker calls you with a hot tip: an 80% at a
$500,000 return for a $150,000 investment. But
a 20% chance of losing everything.
Utility < EV
The EV of the investment is:
$500,000 x 80% - $150,000 x 20% = $370,000
Clearly that’s worth paying a measly $150,000
for! But it’s rational not to accept the deal. You
are right now certain to be able to purchase the
home of your dreams. If you gamble here,
there’s a 20% chance you’ll never get it.
Utility > EV
Suppose you really want to go see a once-in-a-lifetime
sporting match, and tickets are only
$200. However, you only have $100.
A suspicious man comes up to you on the street.
He offers you a gamble: roll two dice, and if you
roll two 1's, you get $200; otherwise, you pay
him $100. The EV is about -$92 (there's only a
1/36 chance of winning), but you might
rationally take the bet, because otherwise you
have no chance of seeing the game!
Principle of Insufficient Reason
Can the rule "maximize expected value/utility"
be used to solve decisions under ignorance?
It seems not: to calculate expected values/
utilities, you need to assign probabilities to the
different states. But the defining feature of
decisions under ignorance is that you cannot
assign probabilities to the states.
Principle of Insufficient Reason
However, according to the principle of
insufficient reason, since you have no reason to
assign any particular probability to any state,
you should assign each state the same
probability. Then you should calculate the
expected value of each act, and choose the act
with the highest expected value.
Example from Last Time

        S1     S2    S3    S4    S5    S6    S7    S8    S9
A1      $0    $99   $99   $99   $99   $99   $99   $99   $99
A2    $100     $0    $0    $0    $0    $0    $0    $0    $0
Maximin Gets the Answer Wrong
In this example from last time, the intuitively
correct act is A1.
Yet the Maximin Rule can be led to pick A2. The
worst possible outcome for both A1 and A2 is $0,
so the rule ties. If we break the tie by comparing
the acts' best remaining outcomes– $99 for A1
versus $100 for A2– we maximize and choose A2.
Minimax Regret Wrong Too
The minimax regret principle gets the answer
wrong too. A1 has a maximum regret of $100 (if
state S1 obtains) whereas A2 has a maximum
regret of $99. If we minimize the maximum
regret, we pick A2 again.
Optimism-Pessimism Rule Wrong
The optimism-pessimism rule also gets the
wrong answer. Remember that this rule says to
compare a weighted average of the best and
worst outcomes of each action. But the worst
outcomes are both $0 for A1 and A2, so the
optimism-pessimism rule just says: pick the one
with the best best outcome, and that’s A2 again.
Maximize Expected Value
However, if we assign each state equal
probability (1/9) under the principle of
insufficient reason, and calculate expected
values, we get:
EV A1 = $0 x (1/9) + $99 x (8/9) = $88
EV A2 = $100 x (1/9) + $0 x (8/9) = $11.11
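The principle-of-insufficient-reason calculation above, as a short Python sketch (payoffs from the example; names are my own):

```python
# A1 pays $99 in eight of the nine states and $0 in one;
# A2 pays $100 in one state and $0 in the other eight.
# Under the principle of insufficient reason, each state gets p = 1/9.
a1 = [0] + [99] * 8
a2 = [100] + [0] * 8
p = 1 / 9

ev_a1 = sum(p * x for x in a1)  # 88.0
ev_a2 = sum(p * x for x in a2)  # about 11.11
print(round(ev_a1, 2), round(ev_a2, 2))  # 88.0 11.11
```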
Is the PIR Correct?
The biggest objection to the principle of
insufficient reason is that it is based on a faulty
assumption. Just because we don’t have enough
information to assign probabilities to the states
does not mean that we should assign them
equal probability– that’s as unjustified as
assigning them any other probabilities.
Example
For example, it’s hard to assign probabilities to
the states “in the next 30 years there is a
nuclear holocaust that brings about an
apocalyptic future” and “things are pretty
normal 30 years from now.”
Surely that doesn’t mean we should treat these
as equally likely, and spend half our money
preparing for nuclear winter and half for
retirement.
Disaster
Additionally, the principle of insufficient reason
may lead us to a disaster. For example, consider
this problem from last time…
Gambling with the Future

A1: Invest your life savings in a promising, but unproven start-up company.
    S1 (the start-up is wildly successful): You make hundreds of millions of dollars.
    S2 (the start-up fails when Google engineers find a way to do everything it does, but better): You lose your life savings.

A2: Play it safe, and invest in a conservative stock portfolio with a modest, but guaranteed payout.
    S1: You pay for your retirement.
    S2: You pay for your retirement.
Expected Values
If we assume the two states are equally likely
(50%), and that your life savings is $150,000 and
the possible payout for the investment is $1.5
million, we get:
EV for A1: ½ x -.15M + ½ x 1.5M = $675,000
EV for A2: ½ x .15M + ½ x .15M = $150,000
JUSTICE
Just Society
I want to finish with an application of decision
theory to philosophical views of social justice.
We as public policy makers, voters, or citizen
activists make choices to affect the nature and
structure of the societies we live in. Sometimes
we are motivated by self-interest, but often our
goal is a fair and just society.
Hong Kong
Different societies are obviously different. In
Hong Kong, an estimated 100,000 people (1.5%
of the population) live in "inadequate housing"
(cage homes, rooftops, subdivided spaces), and
1.15 million (16.5%, and 33% of the elderly) live
in poverty– less than HKD$13,350/mo. for a
family of four. But then, the wealthiest are very
wealthy: the top 10 per cent of earners hold
40% of the wealth, giving Hong Kong the highest
Gini coefficient in Asia.
[Images: cage homes contrasted with a luxury home's bedroom, dining room, kitchen, and living area]
Denmark
Compare Denmark, which spends a lot of time
taking care of the least well off: the highest
minimum wage in the world, high
unemployment payments, the lowest Gini
coefficient in the world, and high taxes on the
highest earners (45-55% on people making more
than HKD$1 million). The percentage of USD
millionaires in HK is about 5 times more than
that of Denmark (8.6% to 1.7%).
John Rawls
John Rawls is a social philosopher whose most
famous work is A Theory of Justice. Rawls argues
there that one society is more just than another
on the basis of the “difference principle”: x is
more just than y = x treats the least well-off
members of its society better than y treats the
least well-off members of its society. 'Least
well-off' here means poor, but also disabled, or
elderly, or mentally handicapped.
HK vs. Denmark
So, according to Rawls, when evaluating Hong
Kong and Denmark, we’d ignore the rich people,
and focus on the least well-off people. And since
government policies and high taxes on the rich
provide a strong social safety net for the poor
(unemployment insurance, high minimum
wages) in Denmark, whereas so many people in
Hong Kong live in terrible conditions, Rawls
would say that Denmark is a more just society
than Hong Kong.
John Harsanyi
John Harsanyi is a decision theorist and a
utilitarian, someone who believes that the right
action is the one that brings about the greatest
average happiness. A just society for Harsanyi
would be one that had the greatest average
happiness. If we assume for a moment that
happiness can be measured in money, we find
that Hong Kong is more just than Denmark: #7 in
GDP per capita vs. #14.
This is a critical thinking class and not a social
philosophy class, so we won’t try to answer this
question (“which is more just?”).
But it is interesting to look at how both authors
argue for their views: using different decision
rules that we have discussed.
The Veil of Ignorance
Rawls introduced the idea of the “veil of
ignorance.” The goal is to come up with the
principles or laws that will govern a just society.
You are supposed to imagine that you are in
charge of writing all the laws for this new
society. Importantly, you will be a member of
this new society. But, you do not know who you
will be: man or woman, sick or healthy, rich or
poor, smart or dumb, etc.
Self-Interest
Rawls wants you to choose purely on the basis
of self-interest. He thinks that even if you act
completely selfishly behind the veil of ignorance,
all of your principles will be just. This is because
you will never accept a principle like “put the
poor in zoos and make them dance for our
amusement” because for all you know, in this
new society, you might be one of the poor in the
zoo made to dance.
Decision Theory
This is where decision theory comes in. Decision
theory is all about how to make decisions in
your own self-interest. We consider different
acts (“enact this law,” “enact that law”),
different states (“I am rich, healthy, and smart,”
“I am poor, sick, and dumb”), and the various
outcomes that result (“I live in a cage home,” “I
live in a decent-sized, government provided flat
with food and air-conditioning”).
Maximin
Rawls thinks that everyone will decide on the
basis of the maximin principle. They will
consider the people who would be hurt most by
each law, and choose the law that would result
in the best outcomes for those hurt most–
because they might turn out to be one of these
people who are most hurt. You’ll want a world
where the disabled are taken care of at an
expense to the abled, because you might be
disabled, and you would need that.
Harsanyi: Maximize Expected Utility
Harsanyi accepts the basic set-up involving the
veil of ignorance, but he denies that people
would choose laws based on the maximin
principle. Harsanyi maintains that we should
decide based on the principle: maximize
expected utility under the principle of
insufficient reason: assume that I’m equally
likely to be any person in the society.
Example
For example, suppose we could set up a society
of 1,000 people where 100 did all the work, and
900 just had fun and did whatever they liked.
Under the principle of insufficient reason, you
have a 100/1,000 = 10% chance of being a
worker and a 900/1,000 = 90% chance of being a
fun-haver.
Expected Utility
And suppose the workers only get 1 unit of
happiness (utility) while the fun-havers all get 90
units. Then the expected utility of choosing this
society is:
EU: 10% x 1 + 90% x 90 = 81.1 utils
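The Harsanyi-style calculation above can be written out directly (numbers from the example; this sketch assumes you are equally likely to be any of the 1,000 members):

```python
# Expected utility of the unequal society, under the principle of
# insufficient reason: equal chance of being any of its members.
n_workers, n_funhavers = 100, 900
u_worker, u_funhaver = 1, 90

eu = (n_workers * u_worker + n_funhavers * u_funhaver) / 1000
print(eu)  # 81.1
```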
Unequal but Just
Compare this to another society where
everyone works a little on a rotating schedule,
and averages 56 utils. Harsanyi would say that
you should choose the unequal society, because
that has the greatest expected utility. And he
would say that that society is just, because what
it is for a society to be just is for it to be selected
for self-interested reasons from behind the veil
of ignorance.
Lots at Stake
We can’t decide the case here, but there are
two things that recommend Rawls’ solution:
First, as Rawls points out, there is a lot at stake.
If you choose laws that allow people to live in
cage homes with cockroaches and barely
enough food to eat, you might wind up being
one of those people. Would you really choose
this society just because the average wealth is great?
Happiness is Cheap
Second, as Rawls also points out, you don’t need
lots of money to be happy. This means that we
won’t run into lots of regret by choosing to help
the least well-off. If I wind up rich in Denmark, I
won't say, "oh no! I could have been so much
richer if I had chosen the laws of Hong Kong
for my society!" Rich is good enough. But if I
wind up poor, and I haven't chosen to help the
poorest, I will regret that a lot.
Do You Need $ for Happiness?
This second consideration is questionable.
According to a survey by Royal Skandia (an
investment firm), Hong Kong ranked third (after
Dubai and Singapore) in the amount of money
people said they would need to be happy. This
was higher than countries in Europe, for
example (HKers need HKD$1.5m/year vs.
HKD$0.67m/year in Germany). However, the surveyors
speculated that this was because of the lack of a
social safety net in the East.
SUMMARY
Summary
Decisions are hard to make even when we’re
certain of the outcomes. When we’re not
certain, but can assign definite probabilities,
decision theory recommends that we maximize
the expected utility of our actions. With
decisions under ignorance, there are lots of
different rules to consider, but considerations
(like the possibility of regret, or the severity of
bad choices) can recommend certain rules for
certain decisions.