
Chapter 4: Joint and Conditional Distributions
Yang Zhenlin
[email protected]
http://www.mysmu.edu/faculty/zlyang/
Chapter Contents
Joint Distribution
Special Joint Distributions:
Multinomial and Bivariate Normal
Covariance and Correlation Coefficient
Conditional Distribution
Conditional Expectation
Conditional Variance
Introduction
In many applications, more than one variable is needed to describe a quantity or a phenomenon of interest, e.g.,
To describe the size of a man, one needs at least height (X) and weight (Y).
To describe a point in a rectangle, one needs the X coordinate and the Y coordinate.
In general, a set of k r.v.s corresponds to the same “unit”: the r.v.s are defined on the same sample space and take values in a k-dimensional Euclidean space.
In this chapter, we focus mainly on the case of two r.v.s, and deal
separately with two cases:
• both X and Y are discrete
• both X and Y are continuous
Joint Distributions
Definition 4.1. (Joint CDF) The joint cumulative distribution
function of r.v.s X and Y is the function defined by
F(x, y) = P(X ≤ x, Y ≤ y).
Definition 4.1 extends naturally to cases of more than two r.v.s.
It applies to both discrete and continuous r.v.s.
Definition 4.2. Let X and Y be two discrete random variables
defined on the same sample space. The joint probability mass
function of X and Y is defined to be
p(x, y) = P(X = x, Y = y)
for all possible values of X and Y.
Definition 4.2 extends directly to cases of more than two r.v.s.
Example 4.1. Xavier and Yvette are two real estate agents. Let X and Y denote the numbers of houses that Xavier and Yvette will sell next week, respectively. Suppose that there are only four houses for sale next week. The joint probability mass function and its graph are presented below.
Find P(X ≥ 1, Y ≥ 1) and P(Y ≥ 1).
p(x, y)     X = 0    X = 1    X = 2
  Y = 0      .12      .42      .06
  Y = 1      .21      .06      .03
  Y = 2      .07      .02      .01

[Graph: 3-D bar chart of the joint pmf p(x, y) omitted]

Answer:
P(X ≥ 1, Y ≥ 1) = .06 + .03 + .02 + .01 = 0.12,
P(Y ≥ 1) = .21 + .06 + .03 + .07 + .02 + .01 = 0.40.
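As a quick check (a sketch, not part of the slides), the two probabilities can be recomputed by summing the relevant cells of the table:

```python
# Joint pmf of Example 4.1; keys are (x, y).
p = {(0, 0): .12, (1, 0): .42, (2, 0): .06,
     (0, 1): .21, (1, 1): .06, (2, 1): .03,
     (0, 2): .07, (1, 2): .02, (2, 2): .01}

print(sum(v for (x, y), v in p.items() if x >= 1 and y >= 1))  # ~0.12
print(sum(v for (x, y), v in p.items() if y >= 1))             # ~0.40
```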
Example 4.2. A bin contains 1000 flower seeds, of which 400 are red, 400 are
white and 200 are pink. Ten seeds are selected at random without replacement.
Let X be the number of red flower seeds and Y the number of white flower seeds selected.
(a) Find the joint pmf of X and Y.
(b) Calculate P(X = 2, Y = 3) and P(X = Y).
Solution: (a) From the counting techniques in Chapter 1, we obtain
$$p(x, y) = \frac{\binom{400}{x}\binom{400}{y}\binom{200}{10-x-y}}{\binom{1000}{10}}, \quad x \ge 0,\; y \ge 0,\; x + y \le 10.$$
(b)
$$P(X = 2, Y = 3) = \frac{\binom{400}{2}\binom{400}{3}\binom{200}{5}}{\binom{1000}{10}} = 0.0081,$$
$$P(X = Y) = \sum_{i=0}^{5} P(X = i, Y = i) = \sum_{i=0}^{5} \frac{\binom{400}{i}\binom{400}{i}\binom{200}{10-2i}}{\binom{1000}{10}} = 0.0263.$$
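Since the counts come straight from combinations, the two answers in (b) can be checked with a few lines of Python (a sketch using math.comb; not from the original slides):

```python
from math import comb

def pmf(x, y):
    """Joint pmf of Example 4.2."""
    if x < 0 or y < 0 or x + y > 10:
        return 0.0
    return comb(400, x) * comb(400, y) * comb(200, 10 - x - y) / comb(1000, 10)

print(pmf(2, 3))                         # ~0.0081
print(sum(pmf(i, i) for i in range(6)))  # P(X = Y), ~0.0263
```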
A function p(x, y) is said to be the joint pmf of discrete r.v.s X and
Y if and only if for all possible values (x, y),
(i) p(x, y) ≥ 0 and (ii) Σx Σy p(x, y) = 1.
Example 4.3. Let the joint pmf of X and Y be given by
$$p(x, y) = \begin{cases} k(x^2 + y^2), & \text{if } (x, y) = (1,1), (1,2), (2,3), (3,3) \\ 0, & \text{otherwise.} \end{cases}$$
(a) Find the value of the constant k.
(b) Calculate P(X > Y), P(X + Y ≤ 4), and P(Y ≥ X).
Solution:
(a)
$$1 = \sum_x \sum_y p(x, y) = \sum_{(x, y)} p(x, y) = k[(1^2 + 1^2) + (1^2 + 2^2) + (2^2 + 3^2) + (3^2 + 3^2)] = 38k, \;\Rightarrow\; k = 1/38.$$
(b) P(X > Y) = 0, P(X + Y ≤ 4) = 7/38, and P(Y ≥ X) = 1.
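Example 4.3 is small enough to verify by enumerating its four support points (again a sketch, not from the slides):

```python
# Support and pmf p(x, y) = (x^2 + y^2)/38 from Example 4.3.
support = [(1, 1), (1, 2), (2, 3), (3, 3)]
k = 1 / sum(x**2 + y**2 for x, y in support)  # 1/38
p = {(x, y): k * (x**2 + y**2) for (x, y) in support}

print(sum(v for (x, y), v in p.items() if x > y))       # P(X > Y)    = 0
print(sum(v for (x, y), v in p.items() if x + y <= 4))  # P(X+Y <= 4) = 7/38
print(sum(v for (x, y), v in p.items() if y >= x))      # P(Y >= X)   = 1
```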
Definition 4.3. A function f(x, y) is said to be the joint probability density function of the continuous r.v.s X and Y if the joint CDF of X and Y can be written as
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\, dv\, du, \quad \text{for all } x \text{ and } y.$$
A function f(x, y) is said to be the joint pdf of continuous r.v.s X and Y if and only if for all possible values (x, y),
$$\text{(i) } f(x, y) \ge 0 \quad \text{and} \quad \text{(ii) } \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1.$$
Marginal pmf:
$$p_X(x) = \sum_y p(x, y); \qquad p_Y(y) = \sum_x p(x, y)$$
Marginal pdf:
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy; \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx$$
In Example 4.3, the marginal pmfs of X and Y are given below:

X        1       2       3
p_X(x)   7/38    13/38   18/38

Y        1       2       3
p_Y(y)   2/38    5/38    31/38
Example 4.4. Let the joint pdf be given by
$$f(x, y) = \begin{cases} k\, x y^2 & \text{if } 0 \le x \le y \le 1 \\ 0 & \text{otherwise.} \end{cases}$$
(a) Find the value of the constant k.
(b) Find the marginal pdfs of X and Y.
(c) Calculate P(X + Y < 1), P(2X < Y), and P(X = Y).
Solution: Some points to note:
Finding the constant k and the desired probabilities is a matter of double integration.
It is important to sketch the regions over which the integrations are performed, so that the integration limits can be determined.
(a)
$$1 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = \int_0^1 \int_0^y k\, x y^2\, dx\, dy = k \int_0^1 \frac{y^2}{2}\, y^2\, dy = \frac{k}{10}, \;\Rightarrow\; k = 10.$$
(b) The marginal pdfs are
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy = \int_x^1 10 x y^2\, dy = \frac{10}{3}\, x (1 - x^3), \quad 0 \le x \le 1,$$
$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx = \int_0^y 10 x y^2\, dx = 5 y^4, \quad 0 \le y \le 1.$$
(c)
$$P(X + Y < 1) = \int_0^{0.5} \int_x^{1-x} 10 x y^2\, dy\, dx = \frac{10}{3} \int_0^{0.5} x\left[(1 - x)^3 - x^3\right] dx = \frac{10}{3} \int_0^{0.5} (x - 3x^2 + 3x^3 - 2x^4)\, dx = 0.1146.$$
[Sketches of the support 0 ≤ x ≤ y ≤ 1 and of the region x ≤ y, x + y ≤ 1, bounded by the lines X = Y and X + Y = 1, omitted]
$$P(2X < Y) = \int_0^{0.5} \int_{2x}^{1} 10 x y^2\, dy\, dx = \frac{10}{3} \int_0^{0.5} x (1 - 8x^3)\, dx = 1/4.$$
[Sketch of the region 2x ≤ y ≤ 1, bounded by the lines 2X = Y and X = Y, omitted]
Finally,
$$P(X = Y) = \int_0^{1} \int_x^{x} 10 x y^2\, dy\, dx = 0.$$
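These double integrals are easy to verify numerically; the sketch below (assuming scipy is available) recomputes the normalizing constant and the probabilities in (c). Note that scipy's dblquad integrates over y first, so the integrand takes (y, x):

```python
from scipy.integrate import dblquad

f = lambda y, x: 10 * x * y**2  # joint pdf on 0 <= x <= y <= 1

# Total probability: x in [0, 1], y in [x, 1]; should be 1 (confirming k = 10).
total, _ = dblquad(f, 0, 1, lambda x: x, lambda x: 1)

# P(X + Y < 1): x in [0, 0.5], y in [x, 1 - x].
p1, _ = dblquad(f, 0, 0.5, lambda x: x, lambda x: 1 - x)

# P(2X < Y): x in [0, 0.5], y in [2x, 1].
p2, _ = dblquad(f, 0, 0.5, lambda x: 2 * x, lambda x: 1)

print(total, p1, p2)  # ~1.0, ~0.1146, ~0.25
```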
Definition 4.4. Two random variables X and Y are said to be
independent if and only if
P(X  x, Y  y) = P(X  x) P(Y  y)
for all possible values (x, y) of (X, Y).
Note:
This definition states that X and Y are independent if and only
if their joint CDF can be written as the product of their
marginal CDFs, i.e., F(x, y) = F_X(x) F_Y(y).
When X and Y are both discrete, the independence condition
can be written as P(X = x, Y = y) = P(X = x) P(Y = y), for all x
and y, i.e., the joint pmf is the product of the marginal pmfs.
When X and Y are both continuous, the independence condition can be written as f(x, y) = f_X(x) f_Y(y), i.e., the joint pdf is the product of the marginal pdfs.
Definition 4.4 extends naturally to the case of many random variables.
Example 4.5. Stores A and B, which belong to the same owner, are located in
two different towns. If the probability density function of the weekly profit of
each store, in thousand dollars, is given by
$$f(x) = \begin{cases} x/4 & \text{if } 1 \le x \le 3 \\ 0 & \text{otherwise,} \end{cases}$$
and the profit of one store is independent of the other, what is the probability
that next week one store makes at least $500 more than the other store?
Solution: Let X and Y denote, respectively, next week’s profits of stores A and
B. The desired probability is
P(X  Y + 1/2) + P(Y  X + 1/2)
Since X and Y are independent, by symmetry,
P(X  Y + 1/2) + P(Y  X + 1/2) = 2 P(X  Y + 1/2)
To calculate P(X  Y + 1/2), we need the joint pdf of X and Y. Since X and Y
are independent, we have,
 xy 16, if 1  x  3, 1  y  3
f ( x, y)  f X ( x) fY ( y)  
otherwise
0,
To find P(X > Y + 1/2), one needs to integrate f(x, y) over the region defined by the conditions 1 ≤ x ≤ 3, 1 ≤ y ≤ 3, and x ≥ y + 1/2:
$$2P(X \ge Y + 1/2) = 2 \int_{3/2}^{3} \int_{1}^{x - 1/2} \frac{xy}{16}\, dy\, dx = \frac{1}{16} \int_{3/2}^{3} x\left[\left(x - \tfrac{1}{2}\right)^2 - 1\right] dx = \frac{1}{16} \int_{3/2}^{3} \left(x^3 - x^2 - \tfrac{3}{4}x\right) dx = 0.54.$$
[Sketch of the region 1 ≤ x ≤ 3, 1 ≤ y ≤ 3, x ≥ y + 1/2, bounded by the line X = Y + 1/2, omitted]
Example 4.6. Prove that the two random variables X and Y with the following joint probability density function are not independent:
$$f(x, y) = \begin{cases} 8xy & \text{if } 0 \le x \le y \le 1 \\ 0 & \text{otherwise.} \end{cases}$$
Solution: The marginal pdfs are
$$f_X(x) = \int_x^1 8xy\, dy = 4x(1 - x^2), \quad 0 \le x \le 1,$$
$$f_Y(y) = \int_0^y 8xy\, dx = 4y^3, \quad 0 \le y \le 1.$$
Since f(x, y) ≠ f_X(x) f_Y(y), X and Y are NOT independent.
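The same kind of numerical check works for Example 4.5 (a sketch, again assuming scipy):

```python
from scipy.integrate import dblquad

f = lambda y, x: x * y / 16  # joint pdf of (X, Y) on [1, 3] x [1, 3]

# P(X >= Y + 1/2): x in [3/2, 3], y in [1, x - 1/2]; double it by symmetry.
p, _ = dblquad(f, 1.5, 3, lambda x: 1, lambda x: x - 0.5)
print(2 * p)  # ~0.54
```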
Special Joint Distributions
Certain special joint distributions such as multinomial and
bivariate normal deserve some detailed attention.
Multinomial is a direct generalization of the binomial. An experiment has k possible outcomes with probabilities π1, π2, …, πk. Let Xi be the number of times that the ith outcome occurs among a total of n independent trials of such an experiment, i = 1, 2, …, k. Then the joint distribution of X1, X2, …, Xk is called the Multinomial Distribution, with joint pmf of the following form:
$$p(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\, \pi_1^{x_1} \pi_2^{x_2} \cdots \pi_k^{x_k},$$
where π1 + π2 + ⋯ + πk = 1, and x1 + x2 + ⋯ + xk = n.
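The pmf translates directly into code. Below is a minimal sketch, with an illustrative (hypothetical) die-rolling example that is not from the slides:

```python
from math import factorial, prod

def multinomial_pmf(xs, pis):
    """Multinomial pmf: n!/(x1!···xk!) · pi_1^x1 ··· pi_k^xk, n = sum(xs)."""
    coef = factorial(sum(xs)) // prod(factorial(x) for x in xs)
    return coef * prod(p ** x for p, x in zip(pis, xs))

# Roll a fair die n = 12 times; probability every face appears exactly twice.
print(multinomial_pmf([2] * 6, [1 / 6] * 6))  # ~0.0034
```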
A Bivariate Normal distribution has the following joint pdf:
$$f(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho \left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right\}$$
[Plots of the bivariate normal pdf for µ1 = µ2 = 0, σ1 = σ2 = 1, with ρ = 0.1 and ρ = 0.9, omitted]
It can be shown that ρ is the correlation coefficient between X1 and X2. When ρ = 0, we have
$$f(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2} \exp\left\{ -\frac{1}{2} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right\} = \frac{1}{\sqrt{2\pi}\, \sigma_1} \exp\left\{ -\frac{1}{2} \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 \right\} \cdot \frac{1}{\sqrt{2\pi}\, \sigma_2} \exp\left\{ -\frac{1}{2} \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right\} = f(x_1)\, f(x_2).$$
So, in this case, X1 and X2 are independent.
For two jointly normal random variables, if they are uncorrelated (i.e., their covariance is zero), then they are independent. This conclusion may not apply to other random variables.
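A direct way to see the factorization at ρ = 0 is to code the bivariate normal pdf above and compare it with the product of the two univariate normal pdfs (a sketch; the test point is arbitrary):

```python
from math import exp, pi, sqrt

def bvn_pdf(x1, x2, m1=0.0, m2=0.0, s1=1.0, s2=1.0, rho=0.0):
    """Bivariate normal pdf, coded from the formula above."""
    z1, z2 = (x1 - m1) / s1, (x2 - m2) / s2
    q = (z1**2 - 2 * rho * z1 * z2 + z2**2) / (2 * (1 - rho**2))
    return exp(-q) / (2 * pi * s1 * s2 * sqrt(1 - rho**2))

def norm_pdf(x, m=0.0, s=1.0):
    """Univariate N(m, s^2) pdf."""
    return exp(-0.5 * ((x - m) / s) ** 2) / (sqrt(2 * pi) * s)

# With rho = 0 the joint pdf equals the product of the marginals.
print(bvn_pdf(0.7, -1.2, rho=0.0))     # same value ...
print(norm_pdf(0.7) * norm_pdf(-1.2))  # ... as the product
```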
Covariance and Correlation Coefficient
Definition 4.5. The covariance between any two jointly
distributed r.v.s X and Y, denoted by Cov(X, Y), is defined by
Cov(X, Y) = E[(X − µX)(Y − µY)] = E[XY] − µX µY,
where µX = E[X] and µY = E[Y].
Properties of Covariance:
For any two r.v.s X and Y, and constants a, b, c and d,
Cov(X, X) = Var(X)
Cov(X, Y) = Cov(Y, X)
Cov(aX+b, cY+d) = ac Cov(X, Y)
If X and Y are independent then Cov(X, Y) = 0.
Definition 4.6. The correlation coefficient between any two jointly distributed r.v.s X and Y, denoted by ρ(X, Y), is defined by
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\, \mathrm{Var}(Y)}}.$$
It measures the degree of linear association between X and Y, and takes values in [−1, 1].
Properties of Correlation Coefficient:
For any two r.v.s X and Y, and constants a, b, c and d,
ρ(aX + b, cY + d) = ρ(X, Y), if ac > 0,
ρ(aX + b, cY + d) = −ρ(X, Y), if ac < 0.
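As a concrete illustration (my own check, reusing the joint pmf table of Example 4.1), the covariance and correlation coefficient follow directly from Definitions 4.5 and 4.6:

```python
from math import sqrt

# Joint pmf of Example 4.1; keys are (x, y).
p = {(0, 0): .12, (1, 0): .42, (2, 0): .06,
     (0, 1): .21, (1, 1): .06, (2, 1): .03,
     (0, 2): .07, (1, 2): .02, (2, 2): .01}

EX = sum(x * v for (x, y), v in p.items())       # E[X]  = 0.7
EY = sum(y * v for (x, y), v in p.items())       # E[Y]  = 0.5
EXY = sum(x * y * v for (x, y), v in p.items())  # E[XY] = 0.2
VX = sum(x * x * v for (x, y), v in p.items()) - EX**2
VY = sum(y * y * v for (x, y), v in p.items()) - EY**2

cov = EXY - EX * EY        # -0.15
rho = cov / sqrt(VX * VY)  # ~ -0.35: the agents' sales are negatively associated
print(cov, rho)
```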
Conditional Distributions
Among the most useful concepts in probability theory are those of conditional probability and conditional expectation, because
In practice, some partial information is often available, and
hence calculations of probabilities and expectations should be
conditional upon the given information;
In calculating a desired probability or expectation it is often
extremely useful to first “condition” on some appropriate
random variables.
The concept of conditional probability, P(A|B) = P(A ∩ B)/P(B),
can be extended directly to give a definition of the conditional
distribution of X given Y = y, where X and Y are two r.v.s,
discrete or continuous.
Definition 4.7. For two discrete r.v.s X and Y, the conditional pmf
of X given Y = y is
$$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p(x, y)}{p_Y(y)}, \quad \text{where } p_Y(y) > 0.$$
The conditional expectation of X given Y = y is defined as
$$E[X \mid Y = y] = \sum_x x\, p_{X|Y}(x \mid y).$$
Clearly, when X is independent of Y, p_{X|Y}(x | y) = p_X(x).
Definition 4.8. For two continuous r.v.s X and Y, the conditional
pdf of X given Y = y is
$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}, \quad \text{where } f_Y(y) > 0.$$
The conditional expectation of X given Y = y is defined as
$$E[X \mid Y = y] = \int_{-\infty}^{\infty} x\, f_{X|Y}(x \mid y)\, dx.$$
Example 4.7. Roll a fair die successively. Let X be the number of rolls until the first 4 and Y the number of rolls until the first 5.
(a) Find the conditional pmf of X given Y = 4.
(b) Calculate P(X > 2 | Y = 4).
(c) Calculate E[X | Y = 4].
Solution:
(a) p_{X|Y}(x | 4) = p(x, 4)/p_Y(4), where
p_Y(4) = P(Y = 4) = (5/6)³(1/6) = 125/1296,
p(1, 4) = P(X = 1, Y = 4) = (1/6)(5/6)²(1/6) = 25/1296,
p(2, 4) = P(X = 2, Y = 4) = (4/6)(1/6)(5/6)(1/6) = 20/1296,
p(3, 4) = P(X = 3, Y = 4) = (4/6)²(1/6)(1/6) = 16/1296,
p(4, 4) = P(X = 4, Y = 4) = 0, and
p(x, 4) = (4/6)³(1/6)(5/6)^{x−5}(1/6), for x = 5, 6, 7, ….
Therefore,
p_{X|Y}(1 | 4) = p(1, 4)/p_Y(4) = 25/125,
p_{X|Y}(2 | 4) = p(2, 4)/p_Y(4) = 20/125,
p_{X|Y}(3 | 4) = p(3, 4)/p_Y(4) = 16/125,
p_{X|Y}(4 | 4) = p(4, 4)/p_Y(4) = 0, and
p_{X|Y}(x | 4) = (4/5)³(5/6)^{x−5}(1/6), for x = 5, 6, 7, ….
(b)
P(X > 2 | Y = 4) = 1 − P(X = 1 | Y = 4) − P(X = 2 | Y = 4) = 80/125.
(c)
$$E(X \mid Y = 4) = (1)\tfrac{25}{125} + (2)\tfrac{20}{125} + (3)\tfrac{16}{125} + (4)(0) + \left(\tfrac{4}{5}\right)^3 \sum_{x=5}^{\infty} x \left(\tfrac{5}{6}\right)^{x-5} \tfrac{1}{6} = \tfrac{113}{125} + \left(\tfrac{4}{5}\right)^3 \sum_{y=1}^{\infty} (y + 4) \left(\tfrac{5}{6}\right)^{y-1} \tfrac{1}{6} = \tfrac{113}{125} + \tfrac{64}{125}(6 + 4) = 6.024$$
(in the above, y = x − 4, and the last sum is the mean of a geometric(1/6) r.v. plus 4).
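The conditional pmf just derived can be checked numerically: it should sum to 1 over x = 1, 2, …, and its mean should be 6.024 (a sketch; the infinite tail is truncated at a large x):

```python
def p_x_given_4(x):
    """Conditional pmf p_{X|Y}(x | 4) from Example 4.7."""
    if x in (1, 2, 3):
        return {1: 25, 2: 20, 3: 16}[x] / 125
    if x == 4:
        return 0.0
    return (4 / 5) ** 3 * (5 / 6) ** (x - 5) * (1 / 6)  # x = 5, 6, 7, ...

xs = range(1, 2000)  # truncate the geometric tail
print(sum(p_x_given_4(x) for x in xs))      # ~1.0
print(sum(x * p_x_given_4(x) for x in xs))  # ~6.024
```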
Example 4.8. The joint pdf of X and Y is given by
$$f(x, y) = \begin{cases} \frac{12}{5}\, x(2 - x - y) & \text{if } 0 \le x \le 1,\; 0 \le y \le 1 \\ 0 & \text{otherwise.} \end{cases}$$
(a) Compute the conditional pdf of X given Y = y, where 0 ≤ y ≤ 1.
(b) Calculate P(X > 0.5 | Y = 0.5).
(c) Calculate E(X | Y = 0.5).
Solution:
(a)
$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)} = \frac{f(x, y)}{\int_{-\infty}^{\infty} f(x, y)\, dx} = \frac{x(2 - x - y)}{\int_0^1 x(2 - x - y)\, dx} = \frac{6x(2 - x - y)}{4 - 3y}, \quad \text{for } 0 \le x \le 1,\; 0 \le y \le 1.$$
(b)
$$P(X > 0.5 \mid Y = 0.5) = \int_{0.5}^{1} f_{X|Y}(x \mid 0.5)\, dx = \int_{0.5}^{1} \frac{6}{5}\, x(3 - 2x)\, dx = 0.65.$$
(c) f_{X|Y}(x | 0.5) = (6/5) x(3 − 2x), so
$$E(X \mid Y = 0.5) = \int_0^1 x\, f_{X|Y}(x \mid 0.5)\, dx = \int_0^1 \frac{6}{5}\, x^2 (3 - 2x)\, dx = 0.6.$$
Definition 4.9. Let X and Y be jointly distributed r.v.s. The conditional variance of X given Y = y is given by
Var(X | Y = y) = E[(X − µ_{X|Y})² | Y = y],
where µ_{X|Y} = E(X | Y = y).
Continuing with Example 4.8, we now find Var(X | Y = 0.5):
$$E(X^2 \mid Y = 0.5) = \int_0^1 x^2 f_{X|Y}(x \mid 0.5)\, dx = \int_0^1 \frac{6}{5}\, x^3 (3 - 2x)\, dx = 0.42,$$
$$\mathrm{Var}(X \mid Y = 0.5) = E(X^2 \mid Y = 0.5) - [E(X \mid Y = 0.5)]^2 = 0.42 - 0.6^2 = 0.06.$$
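The conditional-distribution calculations in Example 4.8 can be verified numerically as well (a sketch using scipy.integrate.quad; the conditional pdf is the one derived in part (a)):

```python
from scipy.integrate import quad

# Conditional pdf f_{X|Y}(x | 0.5) = (6/5) x (3 - 2x), from part (a).
f_cond = lambda x: (6 / 5) * x * (3 - 2 * x)

p, _ = quad(f_cond, 0.5, 1)                       # P(X > 0.5 | Y = 0.5)
ex, _ = quad(lambda x: x * f_cond(x), 0, 1)       # E(X | Y = 0.5)
ex2, _ = quad(lambda x: x * x * f_cond(x), 0, 1)  # E(X^2 | Y = 0.5)
print(p, ex, ex2 - ex**2)  # ~0.65, ~0.6, ~0.06
```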