Transcript Document

Outline

• Historical note about Bayes’ rule
• Bayesian updating for probability density functions
– Salary offer estimate
• Coin trials example
• Reading material:
– Gelman, Andrew, et al. Bayesian Data Analysis. CRC Press, 2003, Chapter 1.

• Slides based in part on lecture by Prof. Joo-Ho Choi of Korea Aerospace University

Historical Note

• Birth of Bayesian statistics
– Rev. Thomas Bayes proposed Bayes’ theorem (1763): the parameter θ of a Binomial distribution is estimated using observed data.
– Laplace independently rediscovered it (1812), attached his name to it, and generalized it to many problems.
– For more than 100 years, the Bayesian “degree of belief” was rejected as vague and subjective; the objective “frequency” interpretation was the one accepted in statistics.
– Jeffreys rediscovered it (1939) and developed the modern theory (1961).
– Until the 1980s, use was still limited by the computation it requires.

• Flourishing of Bayesian statistics
– From 1990, rapid advances in hardware and software made it practical.
– Bayesian techniques are now applied across the sciences (economics, medicine) and engineering.

Bayesian Probability

• What is Bayesian probability?
– Classical: the relative frequency of an event over many repeated trials (e.g., the probability of throwing 10 with a pair of dice).
– Bayesian: the degree of belief that a proposition is true, based on the evidence at hand.

• Saturn mass estimation
– Classical: the mass is fixed but unknown.
– Bayesian: the mass is described probabilistically based on observations (e.g., uniformly in an interval (a, b)).

Bayes’ rule for pdf’s

• θ is a parameter whose probability density is to be estimated based on data y.

• Conditional probability density functions:

p(θ, y) = p(θ) p(y|θ) = p(y) p(θ|y)

• Leading to Bayes’ rule:

p(θ|y) = p(θ) p(y|θ) / p(y)

• Often written as

p(θ|y) ∝ p(θ) L(y|θ)

• L is used because p(y|θ) is called the likelihood function.

• Instead of dividing by p(y) we can divide by the area under the curve:

p(θ|y) = p(θ) L(y|θ) / ∫ p(θ) L(y|θ) dθ
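A minimal numerical sketch of this normalization on a grid; the standard-normal prior and the single observation y = 1.2 are made-up values for illustration:

```python
import numpy as np

# Grid version of Bayes' rule: multiply the prior by the likelihood,
# then divide by the area under the curve so the posterior integrates to 1.
# (Hypothetical example: standard-normal prior, one observation y = 1.2
# with unit measurement noise.)
theta = np.linspace(-5, 5, 2001)
dtheta = theta[1] - theta[0]

prior = np.exp(-0.5 * theta**2) / np.sqrt(2 * np.pi)               # p(theta)
likelihood = np.exp(-0.5 * (1.2 - theta)**2) / np.sqrt(2 * np.pi)  # L(y|theta)

unnorm = prior * likelihood
posterior = unnorm / (unnorm.sum() * dtheta)   # divide by the area

print(posterior.sum() * dtheta)   # area is 1 by construction
```

For this conjugate normal case the posterior peaks halfway between the prior mean 0 and the observation 1.2, which is a quick sanity check on the grid computation.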

Bayesian updating

– The process schematically:

p_post(θ) = k L(y|θ) p_prior(θ)

where the prior distribution p_prior(θ) is updated by the likelihood function L(y|θ) once the observed data y are added, and k is the constant that normalizes the posterior.

[Figure: prior distribution, likelihood function of the observed data, and the resulting posterior distribution; the posterior is narrower than the prior.]

Salary estimate example

• You are considering an engineering position for which salary offers θ (in thousand dollars) have recently followed the triangular distribution on (90, 110):

p(θ) = (θ − 90)/100 for 90 ≤ θ ≤ 100, p(θ) = (110 − θ)/100 for 100 ≤ θ ≤ 110

• Your friend received a $93K offer for a similar position, and you know that their range of offers for such positions is no more than $5K. Given θ, the observed offer is therefore uniform within $5K of θ, so

L(θ) = p(y = 93 | θ) = 1/10 = 0.1 for 88 ≤ θ ≤ 98, and 0 otherwise.

• Before your friend’s data, what was your chance of an offer < $93K?

• Estimate the distribution of the expected offer and the likeliest value.

• The product p(θ) L(θ) = 0.1 × (θ − 90)/100 = 0.001 (θ − 90) on 90 ≤ θ ≤ 98, the overlap of the prior and likelihood supports. The right-hand side is 0.008 at θ = 98, and the area under it is 0.032. To make the area equal to 1:

p(θ | 93) = 0.001 (θ − 90) / 0.032 = (θ − 90)/32, 90 ≤ θ ≤ 98

[Figure: posterior p(θ|93) rising linearly from 0 at θ = 90 to a peak of 0.25 at θ = 98.]
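A short grid check of the numbers in this example, assuming the triangular prior on (90, 110) peaked at 100 and the flat 0.1 likelihood on [88, 98]:

```python
import numpy as np

# Check the salary-example numbers: the area of prior x likelihood
# (about 0.032) and the posterior peak (about 0.25 at theta = 98).
theta = np.linspace(90, 110, 200001)
dtheta = theta[1] - theta[0]

prior = np.where(theta <= 100, (theta - 90) / 100, (110 - theta) / 100)
like = np.where(theta <= 98, 0.1, 0.0)   # offer observed within $5K of theta

unnorm = prior * like
area = unnorm.sum() * dtheta             # ~ 0.032 (value quoted on the slide)
posterior = unnorm / area                # peaks near 0.25 at theta = 98

print(area, posterior.max())
```

The printed values match the 0.032 normalizing area and the 0.25 peak of the posterior shown in the figure.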

Self evaluation question • What value of salary offer to your friend would leave you with the least uncertainty about your own expected offer?

Coin Trials Example

• Problem – For a weighted (uneven) coin, probability of heads is to be determined based on the experiments.

– Assume the true

θ

is 0.78, obtained after ∞ trials.

This is the parameter q to be estimated.

But we don’t know this. Only infer based on experiments.

• Bayesian parameter estimation Prior knowledge on p 0 ( q 1.

2.

) No prior information Normal dist centered at 0.5 with s =0.05

3.

Uniform distribution [0.5, 0.7] Posterior distribution of q

p

 q |

x

  | q

p

0 Experiment data: x times out of n trials. • 4 out of 5 trials • 78 out of 100 trials Likelihood by Binomial dist.

Count of successes Outcome  | q  

n

Given parameter Count of failures

C x

q

x

 1  q  Failure probability Success probability
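A grid version of this update for the two data sets, sketched here under the uniform (no-information) prior:

```python
import numpy as np
from math import comb

# Grid posterior p(theta|x) ~ L(x|theta) p0(theta) with a uniform prior
# (case 1), for the two data sets on the slide.
def posterior(x, n, m=1001):
    theta = np.linspace(0.0, 1.0, m)
    dtheta = theta[1] - theta[0]
    like = np.array([comb(n, x) * t**x * (1 - t)**(n - x) for t in theta])
    post = like / (like.sum() * dtheta)   # uniform prior drops out
    return theta, dtheta, post

for x, n in [(4, 5), (78, 100)]:
    theta, dtheta, post = posterior(x, n)
    mean = (theta * post).sum() * dtheta
    sd = np.sqrt(((theta - mean)**2 * post).sum() * dtheta)
    print(f"{x}/{n}: mean {mean:.3f}, sd {sd:.3f}")
```

More data narrows the posterior: the 78-out-of-100 case has roughly a quarter of the spread of the 4-out-of-5 case, matching the wide and narrow curves on the slides.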

Posterior of the probability of heads under the three priors:

1. No prior (uniform): the posterior centers on the observed frequency.
2. N(0.5, 0.05): the poor prior slows convergence.
3. U(0.5, 0.7): the posterior cannot exceed the 0.7 barrier imposed by the incorrect prior.

[Figures: posterior PDFs of θ for the three priors. Red: prior; wide curve: 4 out of 5; narrow curve: 78 out of 100.]
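The effect of the three priors can be compared numerically. This sketch assumes the 78-out-of-100 data and the N(0.5, 0.05) and U(0.5, 0.7) priors from the slide:

```python
import numpy as np
from math import comb

# Posterior mode under each of the three priors after 78 heads in 100
# trials, computed on a grid.
theta = np.linspace(0.0, 1.0, 1001)
like = np.array([comb(100, 78) * t**78 * (1 - t)**22 for t in theta])

priors = {
    "1. uniform": np.ones_like(theta),
    "2. N(0.5, 0.05)": np.exp(-0.5 * ((theta - 0.5) / 0.05)**2),
    "3. U(0.5, 0.7)": ((theta >= 0.5) & (theta <= 0.7)).astype(float),
}
for name, prior in priors.items():
    post = prior * like
    post = post / (post.sum() * (theta[1] - theta[0]))
    print(name, "posterior mode:", theta[np.argmax(post)])
```

The uniform prior recovers the data’s 0.78; the poor normal prior pulls the mode well below it; and the U(0.5, 0.7) posterior stays pinned at the 0.7 barrier, illustrating all three cases on the slide.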

Probability of 5 consecutive heads

• Prediction using the posterior (no-prior case).
• Exact value is binom(5,5,0.78) = 0.78^5 = 0.289.

Estimation process:
– Draw 10,000 random samples of θ from the posterior PDF p(θ|x).
– For each sample, compute p = binom(5,5,θ) = θ^5.

From the 10,000 samples of the predicted p: median 0.282, with 5% and 95% bounds at 0.172 and 0.416.

[Figures: posterior PDF of θ; histogram of the 10,000 θ samples; histogram of the 10,000 predicted p values.]
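The estimation process above can be reproduced by direct sampling. With a uniform prior, the posterior after 78 heads in 100 trials is the Beta(79, 23) distribution, so we can sample θ from it directly (an assumption standing in for the slide's grid posterior):

```python
import numpy as np

# Monte Carlo posterior prediction of 5 consecutive heads:
# sample theta from the posterior, then compute theta**5 per sample.
rng = np.random.default_rng(42)

theta_samples = rng.beta(79, 23, size=10_000)   # posterior after 78/100
p5 = theta_samples**5                           # binom(5,5,theta) = theta^5

print("median:", np.median(p5))
print("5%/95%:", np.percentile(p5, [5, 95]))
```

The results should land close to the slide’s median 0.282 and bounds (0.172, 0.416); the exact answer 0.289 differs slightly because the posterior is centered a little below the true θ = 0.78.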

Practice problems

1. For the salary estimate problem, what is the probability of getting a better offer than your friend?
2. For the salary problem, calculate the 95% confidence bounds on your salary around the mean and the median of your expected salary distribution.
3. Slide 9 shows the risks associated with using a prior. When is it important to use a prior?

Source: Smithsonian Institution Number: 2004-57325