Random Variables - Arizona State University


Probability Distributions

Random Variables: Finite and Continuous (A Review)

MAT174, Spring 2004

Finite Random Variables

We want to associate probabilities with the values that the random variable takes on.

There are two types of functions that allow us to do this:
• Probability Mass Functions (p.m.f.)
• Cumulative Distribution Functions (c.d.f.)

Probability Distributions

• The pattern of probabilities for a random variable is called its probability distribution.
• In the case of a finite random variable we call this the probability mass function (p.m.f.), $f_X(x)$, where $f_X(x) = P(X = x)$.
• $\sum_{\text{all } x} f_X(x) = \sum_{i=1}^{n} f_X(x_i) = 1$

Probability Mass Function

• This is a p.m.f., drawn as a histogram representing the probabilities.
• The bars are centered above the values of the random variable.
• The heights of the bars are equal to the corresponding probabilities (when the width of your rectangles is 1).

[Figure: histogram of a p.m.f.; horizontal axis: values 0, 1, 2; vertical axis: P(X=x), from 0 to 0.5]

Cumulative Distribution Function

• The same probability information is often given in a different form, called the cumulative distribution function (c.d.f.), or $F_X$.
• $F_X(x) = P(X \le x)$
• $0 \le F_X(x) \le 1$, for all x
• In the finite case, the graph of a c.d.f. should look like a step function, where the maximum is 1 and the minimum is 0.
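The link between a p.m.f. and its c.d.f. can be sketched in a few lines of Python. The p.m.f. values below are hypothetical (they are not taken from the slides); the sketch only illustrates that $F_X(x)$ is a running total of $f_X$ and therefore a step function.

```python
from itertools import accumulate

# Hypothetical p.m.f. for a finite random variable X with values 0, 1, 2
# (illustrative numbers only, not from the course files).
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def cdf_from_pmf(pmf):
    """Return {x: P(X <= x)} by accumulating the p.m.f. in increasing x."""
    xs = sorted(pmf)
    running_totals = accumulate(pmf[x] for x in xs)
    return dict(zip(xs, running_totals))

def cdf_at(pmf, x):
    """Evaluate F_X(x) = P(X <= x) at any real x; constant between values."""
    return sum(p for value, p in pmf.items() if value <= x)

cdf = cdf_from_pmf(pmf)
print(cdf)                # c.d.f. at each possible value of X
print(cdf_at(pmf, 1.5))   # same as F_X(1), since no value lies in (1, 1.5]
print(cdf_at(pmf, -1))    # below the smallest value, so the c.d.f. is 0
```

Evaluating `cdf_at` between two possible values returns the same number as at the lower value, which is exactly the flat part of each step.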

Cumulative Distribution Function

[Figure: Cumulative Distribution Function; horizontal axis: x, from 0 to 14; vertical axis: $F_X(x)$, from 0.0 to 1.0]

Binomial Random Variable

• Let X stand for the number of successes in n Bernoulli trials, where X is called a binomial random variable.
• Binomial setting:
  1. You have n repeated trials of an experiment.
  2. On a single trial, there are only two possible outcomes.
  3. The probability of success is the same from trial to trial.
  4. The outcome of each trial is independent.
• The expected value of a binomial random variable is $E(X) = n \cdot p$.

BINOMDIST

• BINOMDIST is a built-in Excel function that gives values for the p.m.f. and c.d.f. of any binomial random variable.
• It is located under Statistical in the Function menu.
  – BINOMDIST(x, n, p, false) = P(X = x)
  – BINOMDIST(x, n, p, true) = P(X ≤ x)
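For readers working outside Excel, the two modes of BINOMDIST can be mirrored with a short Python sketch. The function names `binom_pmf` and `binom_cdf` are made up for this illustration, and the values n = 10, p = 0.5 are hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable; plays the role of
    BINOMDIST(x, n, p, false)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x); plays the role of BINOMDIST(x, n, p, true).
    It is the sum of the p.m.f. over 0, 1, ..., x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Hypothetical setting: 10 independent trials, success probability 0.5.
n, p = 10, 0.5
print(binom_pmf(3, n, p))   # probability of exactly 3 successes
print(binom_cdf(3, n, p))   # probability of at most 3 successes
```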

Expected Value

• $E(X) = \sum_{\text{all } x} x \cdot f_X(x)$
• This is the average value of X (what happens on average in infinitely many repeated trials of the underlying experiment).
  – It is denoted by $\mu_X$.
• For a binomial random variable, $E(X) = n \cdot p$, where n is the number of independent trials and p is the probability of success.
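The sum defining E(X) can be checked against the binomial shortcut n*p. This is a minimal sketch with hypothetical values n = 4 and p = 0.3; the helper name `expected_value` is an assumption of this example.

```python
from math import comb

def expected_value(pmf):
    """E(X) = sum over all x of x * f_X(x), for a finite random variable
    whose p.m.f. is given as a dict {x: probability}."""
    return sum(x * prob for x, prob in pmf.items())

# Hypothetical binomial setting: n = 4 trials, success probability p = 0.3.
n, p = 4, 0.3
binomial_pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

mean = expected_value(binomial_pmf)
print(mean, n * p)   # the two numbers should agree up to rounding
```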

Continuous Random Variable

• Continuous random variables take on values in an interval; you cannot list all the possible values.
• Examples:
  1. Let X be a randomly selected number between 0 and 1.
  2. Let R be a future value of a weekly ratio of closing prices for IBM stock.
  3. Let W be the exact weight of a randomly selected student.
• You can only calculate probabilities associated with interval values of X. You cannot calculate P(X=x); however, we can still look at its c.d.f., $F_X(x)$.

Probability Density Function (p.d.f)

• Represented by $f_X(x)$
  – $f_X(x)$ is the height of the function $f_X$ at an input of x
  – This function does not give probabilities
• For any continuous random variable X, $P(X = a) = 0$ for every number a.
• Look at probabilities associated with X taking on an interval of values
  – $P(a \le X \le b)$

Probability Density Function (p.d.f)

• To find $P(a \le X \le b)$, we need to look at the portion of the graph that corresponds to this interval.
• How can we relate this to integration?

[Figure: graph of $f_X$ with the region labeled A between a and b]

Probability Density Function

• $A = \int_a^b f_X(x)\,dx = P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$

Cumulative Distribution Function

• CDF
  – $F_X(x) = P(X \le x)$
  – $0 \le F_X(x) \le 1$, for all x
• NOTE: Regardless of whether the random variable is finite or continuous, the c.d.f., $F_X$, has the same interpretation
  – I.e., $F_X(x) = P(X \le x)$

Cumulative Distribution Function

• For the finite case, our c.d.f. graph was a step function.
• For the continuous case, our c.d.f. graph will be a continuous graph.

[Figure: Cumulative Distribution Function; horizontal axis: t, from -1 to 3; vertical axis: $F_T(t)$, from 0.0 to 1.2]

Fundamental Theorem of Calculus (FTC)

• Given that $G(x) = \int g(x)\,dx$
  – Differentiate both sides and what happens?
• Well, from the previous slide we can see that $F_X(x) = \int f_X(x)\,dx$
  – If we differentiate both sides, we get that $F_X'(x) = f_X(x)$
• What does this say?
• How can we verify this claim?
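One informal way to check that the derivative of a c.d.f. gives back the p.d.f. is to compare a numerical slope of $F_X$ with $f_X$ at a few points. The pair used below, $F_X(x) = x^2$ with $f_X(x) = 2x$ on [0, 1], is a hypothetical example chosen for this sketch and does not come from the course files.

```python
def F(x):
    """Hypothetical c.d.f.: F_X(x) = x^2 on [0, 1], 0 below, 1 above."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x * x

def f(x):
    """Matching hypothetical p.d.f.: f_X(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

# The slope of the c.d.f. should match the height of the p.d.f.
for x in (0.2, 0.5, 0.9):
    print(x, numerical_derivative(F, x), f(x))
```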

Example 7 from Course Files

• Define the following function:

$f_X(x) = \begin{cases} -7.5x^4 + 30x^3 - 37.5x^2 + 15x & \text{if } 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$

  – What are the possible values of X?
  – Set up an integral that would give you the following probabilities:
    • P(X < 0.5)
    • P(X > 0.6)
    • P(0.1 ≤ X ≤ 0.9)
    • P(0.1 ≤ X ≤ 5)
  – Verify that the function is a density function
  – What is E(X)?
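The questions in Example 7 can be explored numerically. The sketch below assumes the density is $-7.5x^4 + 30x^3 - 37.5x^2 + 15x$ on [0, 1] and 0 elsewhere; the signs are an assumption, chosen because that sign pattern is nonnegative on [0, 1] and has total area 1. The `integrate` helper is a plain midpoint rule written for this illustration.

```python
def f(x):
    """Example 7 density, with the signs assumed as stated in the lead-in."""
    if 0 <= x <= 1:
        return -7.5 * x**4 + 30 * x**3 - 37.5 * x**2 + 15 * x
    return 0.0

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    width = (b - a) / steps
    return width * sum(g(a + (i + 0.5) * width) for i in range(steps))

# A density must be nonnegative and have total area 1.
lowest_height = min(f(i / 1000) for i in range(1001))
total_area = integrate(f, 0, 1)

# Probabilities are areas; E(X) integrates x * f(x) over the possible values.
p_below_half = integrate(f, 0, 0.5)              # P(X < 0.5)
mean = integrate(lambda x: x * f(x), 0, 1)       # E(X)

print(lowest_height, total_area, p_below_half, mean)
```

Because the density is 0 outside [0, 1], an interval such as [0.1, 5] only contributes area on [0.1, 1].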

Expected Value

• For a finite random variable, we summed over all possible values of x.
• For a continuous random variable, we want to integrate over all possible values of x.
• This implies that $E(X) = \mu_X = \int_{-\infty}^{\infty} x \cdot f_X(x)\,dx$

Example 8 from the Course Files

• Let T be the amount of time between consecutive computer crashes, with the following p.d.f. and c.d.f.:

$f_T(t) = \begin{cases} 0 & \text{if } t < 0 \\ \frac{1}{16.8}\, e^{-t/16.8} & \text{if } t \ge 0 \end{cases}$

$F_T(t) = \begin{cases} 0 & \text{if } t < 0 \\ 1 - e^{-t/16.8} & \text{if } t \ge 0 \end{cases}$

  – What type of r.v. is T?
  – Calculate P(1 < T < 5) in two different ways.
  – What is E(T)?
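The "two different ways" in Example 8 are integrating the p.d.f. over (1, 5) and subtracting c.d.f. values. A minimal sketch, assuming the p.d.f. $\frac{1}{16.8} e^{-t/16.8}$ and c.d.f. $1 - e^{-t/16.8}$ for $t \ge 0$; the `integrate` helper is a simple midpoint rule written for this illustration.

```python
from math import exp

ALPHA = 16.8  # constant appearing in the Example 8 formulas

def pdf(t):
    """f_T(t): 0 for negative t, (1/16.8) * e^(-t/16.8) otherwise."""
    return exp(-t / ALPHA) / ALPHA if t >= 0 else 0.0

def cdf(t):
    """F_T(t): 0 for negative t, 1 - e^(-t/16.8) otherwise."""
    return 1 - exp(-t / ALPHA) if t >= 0 else 0.0

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    width = (b - a) / steps
    return width * sum(g(a + (i + 0.5) * width) for i in range(steps))

way_1 = integrate(pdf, 1, 5)   # area under the p.d.f. between 1 and 5
way_2 = cdf(5) - cdf(1)        # F_T(5) - F_T(1)

print(way_1, way_2)            # the two results should agree closely
```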

Exponential Distribution

• Exponential random variables usually describe the waiting time between consecutive events.
• In general, the p.d.f. and c.d.f. for an exponential random variable X are given as follows:

$f_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{\alpha}\, e^{-x/\alpha} & \text{if } 0 \le x \end{cases}$

$F_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 - e^{-x/\alpha} & \text{if } 0 \le x \end{cases}$

• Any EXPONENTIAL random variable X, with parameter $\alpha$, has $\mu_X = \alpha$.
• How can we verify this?

Continuous R.V. with exponential distribution

[Figure: Probability Density Function; horizontal axis: x, from -3 to 15; vertical axis: $f_X(x)$, from 0.0 to 0.6]

[Figure: Cumulative Distribution Function; horizontal axis: x, from -3 to 15; vertical axis: $F_X(x)$, from 0.0 to 1.2]

• How can we verify that the graph on the left is the graph of a p.d.f.?
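Both verification questions for the exponential distribution come down to integration. The following is a sketch of the two calculations, writing the parameter as $\alpha$ (the symbol is an assumption here) and taking the p.d.f. to be $\frac{1}{\alpha} e^{-x/\alpha}$ for $x \ge 0$ and 0 otherwise. The first line shows that the total area under the p.d.f. is 1; the second uses integration by parts to show that the mean equals the parameter.

```latex
\[
\int_{-\infty}^{\infty} f_X(x)\,dx
  = \int_{0}^{\infty} \frac{1}{\alpha}\, e^{-x/\alpha}\,dx
  = \Bigl[-e^{-x/\alpha}\Bigr]_{0}^{\infty}
  = 0 - (-1) = 1
\]
\[
E(X) = \int_{0}^{\infty} x \cdot \frac{1}{\alpha}\, e^{-x/\alpha}\,dx
  = \Bigl[-x\, e^{-x/\alpha}\Bigr]_{0}^{\infty}
    + \int_{0}^{\infty} e^{-x/\alpha}\,dx
  = 0 + \Bigl[-\alpha\, e^{-x/\alpha}\Bigr]_{0}^{\infty}
  = \alpha
\]
```

Since the p.d.f. is also nonnegative everywhere, the first calculation confirms that it is a valid density.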

Uniform Distribution

• If the probability that X assumes a value is the same for all equal subintervals of an interval [0,u], then we have a continuous uniform random variable.
• X is equally likely to assume any value in [0,u].
• If X is uniform on the interval [0,u], then we have the following formulas:

$f_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{u} & \text{if } 0 \le x \le u \\ 0 & \text{if } u < x \end{cases}$

$F_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{x}{u} & \text{if } 0 \le x \le u \\ 1 & \text{if } u < x \end{cases}$

Continuous R.V. with uniform distribution

• In general, if X is a continuous random variable with a UNIFORM distribution on [0,u], then $\mu_X = \frac{u}{2}$
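The mean of a uniform random variable on [0,u] follows directly from the integral definition of expected value, assuming the p.d.f. is $\frac{1}{u}$ on $[0,u]$ and 0 elsewhere. A short worked calculation:

```latex
\[
E(X) = \int_{-\infty}^{\infty} x \cdot f_X(x)\,dx
     = \int_{0}^{u} x \cdot \frac{1}{u}\,dx
     = \left[\frac{x^{2}}{2u}\right]_{0}^{u}
     = \frac{u^{2}}{2u}
     = \frac{u}{2}
\]
```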

[Figure: Probability Density Function; horizontal axis: x, from -100 to 800; vertical axis: $f_X(x)$, from 0.0000 to 0.0016]

[Figure: Cumulative Distribution Function; horizontal axis: x, from -100 to 800; vertical axis: $F_X(x)$, from 0.0 to 1.0]

Focus on the Project

• Look at the file Auction Focus.xls in the course files
  – This file contains 22 prior leases
  – Looking at each prior lease, we see that if each company bid their signal, every company that won the auction would have lost money
  – We want to devise a new bidding strategy using this data
• Use data to simulate thousands of similar auctions

Identify Random Variables

• We need random variables
  – Let V be the continuous random variable that gives the fair profit value, in millions of dollars, for an oil lease similar to the 22 tracts
    • Look through Auction Focus.xls to see the statistics for the sample
  – Each signal is an observation of the continuous random variable $S_v$, where v is the actual fair value of the tract
    • It is assumed that $E(S_v) = v$
  – $R_v$ gives the error for every lease in a company’s signal
    • Given by the signal minus the actual fair profit value of the lease
    • $E(R_v) = 0$ for every value of v

What should you do?

• From slide 65 in MBD 2 Proj 2.ppt:
  1. Start an Excel file which incorporates the historical data on the lease values and your team’s particular set of signals.
  2. Use these to compute the complete sample of signal errors, and then analyze this sample. Specifically, you should compute the maximum, minimum, and sample mean of the errors. You should also plot a histogram that approximates the actual p.d.f., $f_R$, of R.
• Go to slide 50 to see information about relative frequencies.
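The error analysis described above is meant to be done in Excel, but the arithmetic can be sketched in Python as well. All numbers below are hypothetical placeholders; the real lease values are in Auction Focus.xls and the signals are specific to each team. The helper `relative_frequencies` and the bin edges are assumptions of this example.

```python
# Hypothetical data for illustration only (millions of dollars).
fair_values = [10.0, 12.5, 8.0, 15.0]   # actual fair profit values v
signals = [11.0, 11.5, 9.5, 13.0]       # one signal per lease

# Error = signal minus actual fair profit value of the lease.
errors = [s - v for s, v in zip(signals, fair_values)]

largest = max(errors)
smallest = min(errors)
sample_mean = sum(errors) / len(errors)

def relative_frequencies(data, bin_edges):
    """Fraction of observations in each bin [edge_i, edge_{i+1});
    the last bin also includes its right endpoint."""
    counts = [0] * (len(bin_edges) - 1)
    for value in data:
        for i in range(len(counts)):
            is_last_bin = i == len(counts) - 1
            in_bin = bin_edges[i] <= value < bin_edges[i + 1]
            if in_bin or (is_last_bin and value == bin_edges[-1]):
                counts[i] += 1
                break
    return [count / len(data) for count in counts]

# With bins of width 1, the relative frequencies are also the bar heights
# of a histogram that approximates the p.d.f. of the errors.
frequencies = relative_frequencies(errors, [-2, -1, 0, 1, 2])

print(errors, largest, smallest, sample_mean, frequencies)
```

For bins of a different width, dividing each relative frequency by the bin width gives bar heights whose total area is 1, which is what a p.d.f. approximation needs.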