PowerPoint: Familiar Discrete Distributions.


Some Common Discrete Random Variables

Binomial Random Variables

Binomial experiment

• A sequence of n trials (called Bernoulli trials), each of which results in either a “success” or a “failure”.

• The trials are independent and so the probability of success, p, remains the same for each trial.

• Define a random variable Y as the number of successes observed during the n trials.

• What is the probability p(y), for y = 0, 1, …, n?

• How many successes may we expect? E(Y) = ?

Returning Students

• Suppose the retention rate for a school indicates the probability a freshman returns for their sophomore year is 0.65. Among 12 randomly selected freshmen, what is the probability that exactly 8 of them return to school next year?

Each student either returns or doesn’t. Think of each selected student as a trial, so n = 12.

If we consider “student returns” to be a success, then p = 0.65.

12 trials, 8 successes

• To find the probability of this event, consider the probability for just one sample point in the event.

• For example, the probability the first 8 students return and the last 4 don’t.

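• Since the trials are independent, we just multiply the probabilities. Writing $R_i$ for the event that student i returns and $\bar{R}_i$ for its complement:

$P(R_1 \cap R_2 \cap \cdots \cap R_8 \cap \bar{R}_9 \cap \cdots \cap \bar{R}_{12}) = P(R_1) P(R_2) \cdots P(R_8) P(\bar{R}_9) \cdots P(\bar{R}_{12}) = (0.65)^8 (0.35)^4$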

12 trials, 8 successes

• For the probability of this event, we sum the probabilities for each sample point in the event.

• How many sample points are in this event?

• How many ways can 8 successes and 4 failures occur?

$C^{12}_{8}\, C^{4}_{4}$, or simply $C^{12}_{8}$.

• Each of these sample points has the same probability.

• Hence, summing these probabilities yields $P(Y = 8) = C^{12}_{8}\, (0.65)^8 (0.35)^4 \approx 0.237$

Binomial Probability Function

• A random variable has a binomial distribution with parameters n and p if its probability function is given by

$p(y) = C^{n}_{y}\, p^{y} (1 - p)^{n - y}$, for y = 0, 1, …, n.
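As a quick check on the probability function above, here is a minimal Python sketch that evaluates p(y) using only the standard library and reproduces the returning-students result. The function name `binomial_pmf` is our own choice and does not come from the slides.

```python
from math import comb


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for a binomial random variable with parameters n and p."""
    if not 0 <= y <= n:
        return 0.0
    return comb(n, y) * p**y * (1 - p) ** (n - y)


# Returning-students example: n = 12 trials, p = 0.65, y = 8 successes.
prob_8_return = binomial_pmf(8, 12, 0.65)
print(round(prob_8_return, 3))  # 0.237
```

Here `comb(n, y)` plays the role of the counting factor C in the formula, i.e. the number of sample points with exactly y successes.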

Rats!

• In a research study, rats are injected with a drug. The probability that a rat will die from the drug before the experiment is over is 0.16. Ten rats are injected with the drug.

• What is the probability that at least 8 will survive?

• Would you be surprised if at least 5 died during the experiment?

Quality Control

• For parts machined by a particular lathe, on average, 95% of the parts are within the acceptable tolerance.

• If 20 parts are checked, what is the probability that at least 18 are acceptable?

• If 20 parts are checked, what is the probability that at most 18 are acceptable?
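One way to approach the two quality-control questions is to sum binomial probabilities directly. The sketch below assumes the 20 checked parts behave as independent trials with p = 0.95; the helper names `binomial_pmf` and `binomial_cdf` are illustrative, not from the slides.

```python
from math import comb


def binomial_pmf(y, n, p):
    """P(Y = y) for a binomial random variable with parameters n and p."""
    if not 0 <= y <= n:
        return 0.0
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def binomial_cdf(y, n, p):
    """P(Y <= y), summing the probability function from 0 through y."""
    return sum(binomial_pmf(k, n, p) for k in range(y + 1))


n, p = 20, 0.95
at_least_18 = 1 - binomial_cdf(17, n, p)  # P(Y >= 18) = 1 - P(Y <= 17)
at_most_18 = binomial_cdf(18, n, p)       # P(Y <= 18)
print(f"P(Y >= 18) = {at_least_18:.4f}")
print(f"P(Y <= 18) = {at_most_18:.4f}")
```

The two events overlap at Y = 18, so the two probabilities are not complements and do not sum to 1.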

Binomial Theorem

• As we saw in our Discrete class, the Binomial Theorem allows us to expand

$(p + q)^n = \sum_{y=0}^{n} C^{n}_{y}\, p^{y} q^{n-y}$

• As a result, summing the binomial probabilities, where q = 1 - p is the probability of a failure,

$\sum_{y} p(y) = \sum_{y=0}^{n} C^{n}_{y}\, p^{y} (1 - p)^{n-y} = (p + (1 - p))^n = 1$

Mean and Variance

• If Y is a binomial random variable with parameters n and p, the expected value and variance for Y are given by

$E(Y) = np$ and $V(Y) = np(1 - p)$

Rats!

• In a research study, rats are injected with a drug. The probability that a rat will die from the drug before the experiment is over is 0.16. Ten rats are injected with the drug.

• How many of the rats are expected to survive?

• Find the variance for the number of survivors.
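The mean and variance formulas can be checked against a direct summation over the probability function. The following is a sketch under the assumption that each rat survives independently with probability 1 - 0.16 = 0.84, so "success" means survival; the variable and function names are our own.

```python
from math import comb


def binomial_pmf(y, n, p):
    """P(Y = y) for a binomial random variable with parameters n and p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


n = 10
p_survive = 1 - 0.16  # "success" here means the rat survives

# Closed-form results: E(Y) = np and V(Y) = np(1 - p).
mean_formula = n * p_survive
var_formula = n * p_survive * (1 - p_survive)

# The same quantities computed from the definition of expected value.
mean_direct = sum(y * binomial_pmf(y, n, p_survive) for y in range(n + 1))
var_direct = sum(
    (y - mean_direct) ** 2 * binomial_pmf(y, n, p_survive) for y in range(n + 1)
)

print(f"E(Y): formula {mean_formula:.3f}, direct sum {mean_direct:.3f}")
print(f"V(Y): formula {var_formula:.3f}, direct sum {var_direct:.3f}")
```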

Geometric Random Variables

Your 1st Success

• Similar to the binomial experiment, we consider:

• A sequence of independent Bernoulli trials.

• The probability of “success” equals p on each trial.

• Define a random variable Y as the number of the trial on which the 1st success occurs. (Stop the trials after the first success occurs.)

• What is the probability p(y), for y = 1, 2, …?

• On which trial is the first success expected?

S = success, F = failure

• Consider the values of Y:

y = 1: (S), so $p(1) = p$

y = 2: (F, S), so $p(2) = (q)(p)$

y = 3: (F, F, S), so $p(3) = (q^2)(p)$

y = 4: (F, F, F, S), so $p(4) = (q^3)(p)$

and so on…

Geometric Probability Function

• A random variable has a geometric distribution with parameter p if its probability function is given by

$p(y) = q^{y-1} p$, where q = 1 - p, for y = 1, 2, ...

Success?

• Of course, you need to be clear on what you consider a “success”.

• For example, the 1st success might mean finding the 1st defective item!

With D = defective and G = good, the outcomes are (D), (G, D), (G, G, D), and so on.

Geometric Mean, Variance

• If Y is a geometric random variable with parameter p, the expected value and variance for Y are given by

$E(Y) = \frac{1}{p}$ and $V(Y) = \frac{1 - p}{p^2}$
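To make the geometric formulas concrete, the sketch below evaluates the probability function and compares the closed-form mean and variance with truncated sums. The value p = 0.25 and the cutoff of 2000 trials are arbitrary choices for illustration, and the function name is our own.

```python
def geometric_pmf(y: int, p: float) -> float:
    """P(Y = y): the first success occurs on trial y (y = 1, 2, ...)."""
    if y < 1:
        return 0.0
    return (1 - p) ** (y - 1) * p


p = 0.25  # hypothetical success probability, chosen only for illustration

# Closed-form results: E(Y) = 1/p and V(Y) = (1 - p)/p^2.
mean_formula = 1 / p
var_formula = (1 - p) / p**2

# Truncate the infinite sums; the omitted tail is negligible for this p.
ys = range(1, 2001)
mean_direct = sum(y * geometric_pmf(y, p) for y in ys)
var_direct = sum((y - mean_direct) ** 2 * geometric_pmf(y, p) for y in ys)

print(f"E(Y): formula {mean_formula:.4f}, truncated sum {mean_direct:.4f}")
print(f"V(Y): formula {var_formula:.4f}, truncated sum {var_direct:.4f}")
```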

At least ‘a’ trials? (#3.55)

• For a geometric random variable and an integer a > 0, show $P(Y > a) = q^a$.

• Consider $P(Y > a) = 1 - P(Y \le a) = 1 - p(1 + q + q^2 + \cdots + q^{a-1}) = 1 - p \cdot \frac{1 - q^a}{1 - q} = q^a$, based on the sum of a geometric series and the fact that 1 - q = p.

“Memoryless Property”

• For the geometric distribution, $P(Y > a + b \mid Y > a) = q^b = P(Y > b)$

“at least 5 more trials?”

We note $P(Y > 7 \mid Y > 2) = q^5 = P(Y > 5)$.

That is, “knowing the first two trials were failures, the probability a success won’t occur on the next 5 trials” is identical to… “just starting the trials and a success won’t occur on the first 5 trials”
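Both the tail formula P(Y > a) = q^a and the memoryless property can be checked numerically. The sketch below computes the tail by direct summation rather than by the closed form, so the comparison is not circular; the value p = 0.3 and the function name `geometric_tail` are assumptions made only for illustration.

```python
def geometric_tail(a: int, p: float) -> float:
    """P(Y > a) by direct summation: 1 - sum of q**(y-1) * p for y = 1..a."""
    q = 1 - p
    return 1 - sum(q ** (y - 1) * p for y in range(1, a + 1))


p = 0.3  # hypothetical success probability, chosen only for illustration
q = 1 - p

# Tail formula: each summed tail should agree with q**a.
for a in (1, 2, 5, 7):
    print(a, geometric_tail(a, p), q**a)

# Memoryless property with a = 2, b = 5:
# P(Y > 7 | Y > 2) = P(Y > 7) / P(Y > 2), since {Y > 7} is contained in {Y > 2}.
conditional = geometric_tail(7, p) / geometric_tail(2, p)
unconditional = geometric_tail(5, p)
print(conditional, unconditional, q**5)
```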

Negative Binomial Distribution

• Again, consider independent Bernoulli trials with probability of “success” p on each trial…

• Instead of watching for the 1st success, let Y be the number of the trial on which the r-th success occurs. (Stop the trials after the r-th success occurs.)

• For a given value r, the probability p(y) is

$p(y) = C_{y-1,\,r-1}\, p^{r} (1 - p)^{y - r}$, for y = r, r + 1, …

where $C_{y-1,\,r-1}$ counts the ways to place r - 1 successes among the first y - 1 trials.

Negative Binomial

• To determine the probability the 4th success occurs on the 7th trial, we compute

$p(7) = C_{6,3}\, p^{4} (1 - p)^{3}$

• Note this is actually just the binomial probability of 3 successes during the first 6 trials, followed by one more success:

$p(7) = \left[ C_{6,3}\, p^{3} (1 - p)^{3} \right] \cdot p$

where the final factor p is for “a 4th success on the last trial”.
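The decomposition above (a binomial probability for the first 6 trials, times one more success) can be verified numerically. This is a minimal sketch; p = 0.5 is a hypothetical value since the slide leaves p unspecified, and the function names are our own.

```python
from math import comb


def negative_binomial_pmf(y: int, r: int, p: float) -> float:
    """P(Y = y): the r-th success occurs on trial y (y = r, r + 1, ...)."""
    if y < r:
        return 0.0
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(exactly y successes in n independent trials)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


p = 0.5  # hypothetical success probability, chosen only for illustration

direct = negative_binomial_pmf(7, 4, p)  # 4th success on the 7th trial
decomposed = binomial_pmf(3, 6, p) * p   # 3 successes in 6 trials, then a success
print(direct, decomposed)  # 0.15625 0.15625
```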

Negative Binomial

• For the negative binomial distribution, we have

$E(Y) = \frac{r}{p}$ and $V(Y) = \frac{r(1 - p)}{p^2}$

• For example, if a success occurs 10% of the time (i.e., p = 0.1), then to find the 4th success, we expect to require 40 trials on average:

$E(Y) = \frac{4}{0.1} = 40$

Intuitively, wouldn’t you expect 40 trials?
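For readers who prefer a numerical confirmation of the intuition, the sketch below approximates the mean and variance for r = 4 and p = 0.1 by summing over the probability function. The cutoff of 3000 trials is an arbitrary truncation of the infinite sum, and the function name is our own.

```python
from math import comb


def negative_binomial_pmf(y: int, r: int, p: float) -> float:
    """P(Y = y): the r-th success occurs on trial y (y = r, r + 1, ...)."""
    if y < r:
        return 0.0
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)


r, p = 4, 0.1

# Truncate the infinite sums; the omitted tail is negligible for these values.
ys = range(r, 3001)
mean_direct = sum(y * negative_binomial_pmf(y, r, p) for y in ys)
var_direct = sum((y - mean_direct) ** 2 * negative_binomial_pmf(y, r, p) for y in ys)

print(f"E(Y): formula {r / p:.2f}, truncated sum {mean_direct:.2f}")
print(f"V(Y): formula {r * (1 - p) / p**2:.2f}, truncated sum {var_direct:.2f}")
```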

Poisson Random Variables

Number of occurrences

• Let Y represent the number of occurrences of an event in an interval of size s.

• Here we may be referring to an interval of time, distance, space, etc.

• For example, we may be interested in the number of customers Y arriving during a given time interval.

• We call Y a Poisson random variable.

Poisson R. V.

• A random variable has a Poisson distribution with parameter λ if its probability function is given by

$p(y) = \frac{\lambda^{y} e^{-\lambda}}{y!}$, where y = 0, 1, 2, …

We’ll see that λ is the “average rate” at which the events occur. That is, E(Y) = λ.

Queries

• If the number of database queries processed by a computer in a time interval is a Poisson random variable with an average of 6 queries per minute, find the probability that 4 queries occur in a one minute interval.

$p(4) = \frac{6^4 e^{-6}}{4!} \approx 0.13385$

Fewer Queries

• As before, for the Poisson random variable with an average of 6 queries per minute…

• Find the probability there are fewer than 6 queries in a one-minute interval:

$P(Y < 6) = P(Y \le 5) = \text{poissoncdf}(6, 5) \approx 0.44568$
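Both query calculations can be reproduced without a calculator. The sketch below is a plain-Python stand-in for the calculator function; `poisson_pmf` and `poisson_cdf` are our own names, and the argument order of `poisson_cdf` (rate first, then y) is chosen only to mirror the `poissoncdf(6, 5)` call quoted in the slide.

```python
from math import exp, factorial


def poisson_pmf(y: int, lam: float) -> float:
    """P(Y = y) for a Poisson random variable with average rate lam."""
    if y < 0:
        return 0.0
    return lam**y * exp(-lam) / factorial(y)


def poisson_cdf(lam: float, y: int) -> float:
    """P(Y <= y), summing the probability function from 0 through y."""
    return sum(poisson_pmf(k, lam) for k in range(y + 1))


# Exactly 4 queries in one minute, at an average of 6 per minute.
print(round(poisson_pmf(4, 6), 5))  # 0.13385

# Fewer than 6 queries in one minute: P(Y < 6) = P(Y <= 5).
print(round(poisson_cdf(6, 5), 5))  # 0.44568
```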

Some Poisson Variables

• Number of incoming telephone calls to a switchboard within a given time interval;

• Number of errors (incorrect bits) received by a modem during a given time interval;

• Number of chocolate chips in one of Dr. Vestal’s chocolate chip cookies;

• Number of claims processed by a particular insurance company on a single day;

• Number of white blood cells in a drop of blood;

• Number of dead deer along a mile of highway.

Poisson mean, variance

• If Y is a Poisson random variable with parameter λ, the expected value and variance for Y are given by

$E(Y) = \lambda$ and $V(Y) = \lambda$

Hypergeometric Random Variables

Sampling without replacement

• When sampling with replacement, each trial remains independent. For example…

• If balls are replaced, P(red ball on 2nd draw) = P(red ball on 2nd draw | first ball was red).

• If balls are not replaced, then given the first ball is red, there is less chance of a red ball on the 2nd draw.

Though for a large population of balls, the effect may be minimal.

n trials, y red balls

• Suppose there are r red balls, and N – r other balls.

• Consider Y, the number of red balls in n selections, where now the trials may be dependent.

(for sampling without replacement, when sample size is significant relative to the population)

• The probability that y of the n selected balls are red is

$p(y) = \frac{C^{r}_{y}\, C^{N-r}_{n-y}}{C^{N}_{n}}$

Hypergeometric R. V.

• A random variable has a hypergeometric distribution with parameters N, n, and r if its probability function is given by

$p(y) = \frac{C^{r}_{y}\, C^{N-r}_{n-y}}{C^{N}_{n}}$

where 0 ≤ y ≤ min(n, r) and n - y ≤ N - r.

Hypergeometric mean, variance

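• If Y is a hypergeometric random variable with parameters N, n, and r, the expected value and variance for Y are given by

$E(Y) = \frac{nr}{N}$ and $V(Y) = n \left(\frac{r}{N}\right) \left(\frac{N - r}{N}\right) \left(\frac{N - n}{N - 1}\right)$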

Sample of 20

Suppose among a supply of 5000 parts produced during a given week, there are 100 that don’t meet the required quality standard. Twenty of the parts are randomly selected and checked to see if they meet the standard. Let Y be the number in the sample that don’t meet the standard.

a) Compute the probability that exactly 2 of the sampled parts fail to meet the quality standard.

b) Determine the mean, E(Y).
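A possible way to work this exercise numerically is sketched below, treating the 100 nonconforming parts as the "red balls" so that N = 5000, r = 100, and n = 20. The function name `hypergeometric_pmf` is our own, and the code only illustrates the formulas from the preceding slides.

```python
from math import comb


def hypergeometric_pmf(y: int, N: int, n: int, r: int) -> float:
    """P(Y = y): y of the n items sampled without replacement are 'red'."""
    if y < 0 or y > min(n, r) or n - y > N - r:
        return 0.0
    return comb(r, y) * comb(N - r, n - y) / comb(N, n)


N, n, r = 5000, 20, 100  # population size, sample size, nonconforming parts

# a) Probability that exactly 2 sampled parts fail the standard.
prob_2 = hypergeometric_pmf(2, N, n, r)
print(f"P(Y = 2) = {prob_2:.4f}")

# b) Mean from the formula E(Y) = nr/N, checked against a direct sum.
mean_formula = n * r / N
mean_direct = sum(y * hypergeometric_pmf(y, N, n, r) for y in range(n + 1))
print(f"E(Y): formula {mean_formula:.4f}, direct sum {mean_direct:.4f}")
```

Because the sample of 20 is small relative to the population of 5000, a binomial calculation with p = 100/5000 = 0.02 should give a similar answer for part a.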