Transcript Document
Outline
• Historical note about Bayes’ rule
• Bayesian updating for probability density functions
  – Salary offer estimate
• Coin trials example
• Reading material:
  – Gelman, Andrew, et al. Bayesian Data Analysis. CRC Press, 2003, Chapter 1.
• Slides based in part on lecture by Prof. Joo-Ho Choi of Korea Aerospace University
Historical Note
• Birth of Bayesian statistics
  – Rev. Thomas Bayes proposed Bayes’ theorem (1763): the parameter θ of a binomial distribution is estimated using observed data.
  – Laplace rediscovered it, attached his name to it (1812), and generalized it to many problems.
  – For more than 100 years, the Bayesian “degree of belief” was rejected as vague and subjective; the objective “frequency” interpretation was accepted in statistics.
  – Jeffreys (1939) rediscovered it and built the modern theory (1961). Until the 1980s, its use was still limited by the computation it requires.
• Flourishing of Bayesian statistics
  – From 1990, rapid advances in hardware and software made it practical.
  – Bayesian techniques are now applied across the sciences (economics, medicine) and engineering.
Bayesian Probability
• What is Bayesian probability?
  – Classical: relative frequency of an event, given many repeated trials (e.g., the probability of throwing 10 with a pair of dice)
  – Bayesian: degree of belief that a statement is true, based on the evidence at hand
• Saturn mass estimation
  – Classical: the mass is fixed but unknown.
  – Bayesian: the mass is described probabilistically based on observations (e.g., uniformly in an interval (a, b)).
Bayes’ rule for PDFs
• θ is a parameter whose probability density we estimate based on data y.
• Conditional probability density functions:
  p(θ, y) = p(θ) p(y|θ) = p(y) p(θ|y)
• Leading to Bayes’ rule:
  p(θ|y) = p(θ) p(y|θ) / p(y)
• Often written as
  p(θ|y) ∝ L(y|θ) p(θ)
  where L is used because p(y|θ) is called the likelihood function.
• Instead of dividing by p(y), we can divide by the area under the curve:
  p(θ|y) = L(y|θ) p(θ) / ∫ L(y|θ) p(θ) dθ
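The “divide by the area under the curve” step can be sketched numerically on a grid; the prior and likelihood below are illustrative choices, not from the lecture:

```python
import numpy as np

# Discretize the parameter theta on a grid
theta = np.linspace(0.0, 1.0, 1001)
d = theta[1] - theta[0]                  # grid spacing

# Illustrative (hypothetical) prior and likelihood:
prior = np.ones_like(theta)              # uniform prior p(theta)
likelihood = theta**4 * (1.0 - theta)    # L(y|theta), e.g. 4 heads in 5 flips

# Bayes' rule: normalize the product by the area under its curve
unnorm = likelihood * prior
posterior = unnorm / (unnorm.sum() * d)

# The posterior now integrates to 1
print(round(posterior.sum() * d, 6))  # 1.0
```

This avoids ever computing p(y) explicitly: the normalizing constant is just the numerical area of the unnormalized product.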
Bayesian updating
• The process, schematically: the prior distribution p(θ) is updated by the likelihood function L(y|θ) as observed data y are added, yielding the posterior distribution p(θ|y).
[Figure: schematic of Bayesian updating — prior PDF and likelihood of the observed data combining into a narrower posterior distribution]
Salary estimate example
• You are considering an engineering position for which salary offers θ (in thousand dollars) have recently followed the triangular distribution
  p(θ) = (110 − θ)/200,  90 ≤ θ ≤ 110
• Your friend received a $93K offer for a similar position, and you know that offers for such positions range no more than $5K from the typical offer.
• Before your friend’s data, what was your chance of an offer < $93K?
• Estimate the distribution of the expected offer and its likeliest value.
• The likelihood of the friend’s $93K offer is uniform within $5K of θ:
  L(93|θ) = p(y = 93|θ) = 1/10 = 0.1,  88 ≤ θ ≤ 98
• The unnormalized posterior is the prior times the likelihood:
  p(θ) L(93|θ) = 0.1 (110 − θ)/200,  90 ≤ θ ≤ 98
  (the right-hand side is 0.008 at θ = 94, for example).
• The area under this curve is 0.064, so to make the area equal to 1:
  p(θ|93) = (110 − θ)/128,  90 ≤ θ ≤ 98
[Figure: posterior PDF of θ on 90 ≤ θ ≤ 98, decreasing from 0.156 at θ = 90 to 0.094 at θ = 98]
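The salary posterior can be checked numerically; a minimal grid-based sketch (the grid resolution is an arbitrary choice):

```python
import numpy as np

# Grid over the support of the triangular prior (salary in $K)
theta = np.linspace(90.0, 110.0, 2001)
d = theta[1] - theta[0]

prior = (110.0 - theta) / 200.0                # triangular prior on [90, 110]

# Prior chance of an offer below $93K
print(round(prior[theta < 93.0].sum() * d, 3))   # about 0.278

# Likelihood of the friend's $93K offer: uniform within $5K of theta
like = np.where(np.abs(93.0 - theta) <= 5.0, 0.1, 0.0)

# Posterior: normalize the product by the area under its curve
unnorm = prior * like
posterior = unnorm / (unnorm.sum() * d)

print(theta[np.argmax(posterior)])               # likeliest offer: 90.0
print(round((posterior * theta).sum() * d, 2))   # posterior mean, about 93.7
```

The likeliest (MAP) value sits at the lower edge θ = 90, where the decreasing triangular prior is largest within the likelihood’s support.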
Self-evaluation question
• What value of your friend’s salary offer would leave you with the least uncertainty about your own expected offer?
Coin Trials Example
• Problem
  – For a weighted (uneven) coin, the probability of heads is to be determined from experiments.
  – Assume the true θ is 0.78, the value that would be obtained after infinitely many trials. This is the parameter to be estimated; we do not know it and can only infer it from experiments.
• Bayesian parameter estimation
  – Prior knowledge p0(θ), three cases:
    1. No prior information
    2. Normal distribution centered at 0.5 with σ = 0.05
    3. Uniform distribution on [0.5, 0.7]
  – Experiment data: x heads out of n trials
    • 4 out of 5 trials
    • 78 out of 100 trials
  – Likelihood given by the binomial distribution:
    p(x|θ) = C(n, x) θ^x (1 − θ)^(n−x)
    where x is the count of successes (heads), n − x the count of failures, θ the success probability, and 1 − θ the failure probability.
  – Posterior distribution of θ:
    p(θ|x) ∝ p(x|θ) p0(θ)
Probability-of-heads posteriors
• Results for the three prior distributions:
  1. No prior (uniform): the posterior follows the data alone.
  2. N(0.5, 0.05): the poor prior slows convergence.
  3. U(0.5, 0.7): the posterior cannot exceed the 0.7 barrier due to the incorrect prior.
[Figure: posterior PDFs of θ on [0, 1] for the three priors; red: prior; wide curve: 4 heads out of 5; narrow curve: 78 heads out of 100]
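The three-prior comparison can be sketched on a grid; an unnormalized Gaussian shape suffices for the second prior, since the posterior is renormalized anyway:

```python
import numpy as np
from math import comb

theta = np.linspace(0.0, 1.0, 1001)
d = theta[1] - theta[0]

def grid_posterior(prior, x, n):
    """Binomial likelihood times prior, normalized by the area under the curve."""
    like = comb(n, x) * theta**x * (1.0 - theta)**(n - x)
    unnorm = like * prior
    return unnorm / (unnorm.sum() * d)

# The three priors from the lecture
priors = {
    "no prior (uniform)": np.ones_like(theta),
    "N(0.5, 0.05)": np.exp(-0.5 * ((theta - 0.5) / 0.05) ** 2),
    "U[0.5, 0.7]": ((theta >= 0.5) & (theta <= 0.7)).astype(float),
}

# Posterior modes after observing 78 heads in 100 trials
for name, p0 in priors.items():
    post = grid_posterior(p0, 78, 100)
    print(name, theta[np.argmax(post)])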
Probability of 5 consecutive heads
• Prediction using the posterior (no-prior case).
• Exact value with the true parameter: binom(5,5,0.78) = 0.78^5 = 0.289.
• Posterior prediction process:
  1. Draw random samples of θ from the posterior PDF p(θ|x).
  2. For each sampled θ, compute p = binom(5,5,θ) = θ^5.
• From 10,000 samples of the predicted p:
  median = 0.282, 5% bound = 0.172, 95% bound = 0.416.
[Figure: posterior PDF of θ; histogram of 10,000 θ samples; histogram of 10,000 predicted p values with the median and 5%/95% bounds marked]
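The prediction process can be sketched by sampling. With a uniform prior and 78 heads in 100 trials, the posterior of θ is the conjugate Beta(79, 23); sampling from it stands in here for drawing from the grid posterior (the random seed is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: draw random samples of theta from the posterior PDF
# (Beta(79, 23) = uniform prior updated with 78 heads, 22 tails)
theta = rng.beta(79, 23, size=10_000)

# Step 2: probability of 5 consecutive heads for each sampled theta
p5 = theta**5

# Summaries of the predicted probability
print(round(float(np.median(p5)), 3))           # median
print(round(float(np.quantile(p5, 0.05)), 3))   # 5% bound
print(round(float(np.quantile(p5, 0.95)), 3))   # 95% bound
```

The summaries land close to the lecture’s median 0.282 and 5%/95% bounds 0.172 and 0.416, and scatter around the exact value 0.78^5 = 0.289.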
Practice Problems
1. For the salary estimate problem, what is the probability of getting a better offer than your friend?
2. For the salary problem, calculate the 95% confidence bounds on your salary around the mean and median of your expected salary distribution.
3. Slide 9 shows the risks associated with using a prior. When is it important to use a prior?