Basics on Probability Jingrui He 09/11/2007 Coin Flips You flip a coin Head with probability 0.5 You flip 100 coins How many heads would you expect.
Basics on Probability
Jingrui He
09/11/2007
Coin Flips
You flip a coin
Head with probability 0.5
You flip 100 coins
How many heads would you expect?
Coin Flips cont.
You flip a coin
Head with probability p
Binary random variable
Bernoulli trial with success probability p
You flip k coins
How many heads would you expect?
Number of heads X: discrete random variable
Binomial distribution with parameters k and p
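The expected number of heads in k flips is k * p, the mean of the Binomial distribution. A minimal simulation sketch, assuming Python's standard `random` module; the function name `count_heads`, the seed, and the number of trials are illustrative choices and not part of the slides.

```python
import random

def count_heads(k, p, rng):
    """Flip k coins, each landing heads with probability p; return the number of heads."""
    return sum(1 for _ in range(k) if rng.random() < p)

rng = random.Random(0)   # fixed seed so the run is repeatable (arbitrary choice)
trials = 2000
k, p = 100, 0.5
average = sum(count_heads(k, p, rng) for _ in range(trials)) / trials
print(f"average heads over {trials} trials: {average:.2f} (k * p = {k * p})")
```

The empirical average should land close to 50, the value of k * p for 100 fair coins.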
Discrete Random Variables
Random variables (RVs) which may take on
only a countable number of distinct values
E.g. the total number of heads X you get if you
flip 100 coins
X is a RV with arity k if it can take on exactly
one value out of $\{x_1, \ldots, x_k\}$
E.g. the possible values that X can take on are 0,
1, 2,…, 100
Probability of Discrete RV
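Probability mass function (pmf): $P(X = x_i)$
Easy facts about pmf
$\sum_i P(X = x_i) = 1$
$P(X = x_i \cup X = x_j) = P(X = x_i) + P(X = x_j)$ if $i \neq j$
$P(X = x_i \cap X = x_j) = 0$ if $i \neq j$
$P(X = x_1 \cup X = x_2 \cup \cdots \cup X = x_k) = 1$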
Common Distributions
Uniform $X \sim U[1, \ldots, N]$
X takes values 1, 2, …, N
$P(X = i) = 1/N$
E.g. picking balls of different colors from a box
Binomial $X \sim \mathrm{Bin}(n, p)$
X takes values 0, 1, …, n
$P(X = i) = \binom{n}{i} p^i (1 - p)^{n - i}$
E.g. coin flips
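The two pmfs above can be written down directly. This is a small sketch using only the standard library; the function names `uniform_pmf` and `binomial_pmf` are illustrative, not from the slides.

```python
from math import comb

def uniform_pmf(i, n_values):
    """P(X = i) for X uniform on 1, ..., n_values."""
    return 1 / n_values if 1 <= i <= n_values else 0.0

def binomial_pmf(i, n, p):
    """P(X = i) for X ~ Bin(n, p)."""
    if not 0 <= i <= n:
        return 0.0
    return comb(n, i) * p**i * (1 - p) ** (n - i)

# Each pmf sums to 1 over its support (up to floating-point rounding).
print(sum(uniform_pmf(i, 6) for i in range(1, 7)))
print(sum(binomial_pmf(i, 100, 0.5) for i in range(101)))

# Probability of exactly 50 heads in 100 fair flips, roughly 0.08.
print(binomial_pmf(50, 100, 0.5))
```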
Coin Flips of Two Persons
Your friend and you both flip coins
Head with probability 0.5
You flip 50 times; your friend flips 100 times
How many heads will both of you get?
Joint Distribution
Given two discrete RVs X and Y, their joint
distribution is the distribution of X and Y
together
E.g. P(You get 21 heads AND your friend gets 70 heads)
$\sum_x \sum_y P(X = x \cap Y = y) = 1$
E.g. $\sum_{i=0}^{50} \sum_{j=0}^{100} P(\text{You get } i \text{ heads AND your friend gets } j \text{ heads}) = 1$
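As a rough numerical check of the double sum, the sketch below builds the joint pmf for the two-person example. It assumes that your flips and your friend's flips do not influence each other, so the joint probability is the product of two Binomial pmfs; that assumption and the helper names are illustrative.

```python
from math import comb

def binomial_pmf(i, n, p):
    """P(X = i) for X ~ Bin(n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

def joint_pmf(i, j):
    """P(you get i heads AND your friend gets j heads), assuming the flips do not interact."""
    return binomial_pmf(i, 50, 0.5) * binomial_pmf(j, 100, 0.5)

total = sum(joint_pmf(i, j) for i in range(51) for j in range(101))
print(f"sum over all (i, j): {total:.6f}")
print(f"P(21 heads AND 70 heads) = {joint_pmf(21, 70):.3e}")
```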
Conditional Probability
$P(X = x \mid Y = y)$ is the probability of $X = x$,
given the occurrence of $Y = y$
E.g. you get 0 heads, given that your friend gets
61 heads
$P(X = x \mid Y = y) = \frac{P(X = x \cap Y = y)}{P(Y = y)}$
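A small worked example of the ratio above, using exact fractions. The joint table is hypothetical: X is the result of the first of two fair flips (1 = head) and Y is the total number of heads.

```python
from fractions import Fraction

# Hypothetical joint pmf: X = head on the first of two fair flips, Y = total heads.
joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4), (1, 2): Fraction(1, 4),
}

def marginal_y(y):
    """P(Y = y), obtained by summing the joint pmf over x."""
    return sum(p for (_, yj), p in joint.items() if yj == y)

def conditional(x, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y); y must have positive probability."""
    return joint.get((x, y), Fraction(0)) / marginal_y(y)

print(conditional(1, 1))  # 1/2
print(conditional(1, 2))  # 1
```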
Law of Total Probability
Given two discrete RVs X and Y, which take
values in $\{x_1, \ldots, x_m\}$ and $\{y_1, \ldots, y_n\}$, we have
$P(X = x_i) = \sum_j P(X = x_i \cap Y = y_j) = \sum_j P(X = x_i \mid Y = y_j) P(Y = y_j)$
Marginalization
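$P(X = x_i) = \sum_j P(X = x_i \cap Y = y_j) = \sum_j P(X = x_i \mid Y = y_j) P(Y = y_j)$
Marginal Probability: $P(X = x_i)$ and $P(Y = y_j)$
Joint Probability: $P(X = x_i \cap Y = y_j)$
Conditional Probability: $P(X = x_i \mid Y = y_j)$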
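Marginalization can be sketched on a tiny example. The setup is hypothetical: Y selects one of two coins (a fair one, or a biased one with head probability 3/4), each with probability 1/2, and X is the flip result.

```python
from fractions import Fraction

# Hypothetical setup: Y picks one of two coins, X is the flip result (1 = head).
p_y = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
p_head_given_y = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

def p_x_given_y(x, y):
    """P(X = x | Y = y) for x in {0, 1}."""
    head = p_head_given_y[y]
    return head if x == 1 else 1 - head

def marginal_x(x):
    """P(X = x) = sum_j P(X = x | Y = y_j) P(Y = y_j)."""
    return sum(p_x_given_y(x, y) * p_y[y] for y in p_y)

print(marginal_x(1))  # 5/8
print(marginal_x(0))  # 3/8
```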
Bayes Rule
X and Y are discrete RVs…
$P(X = x \mid Y = y) = \frac{P(X = x \cap Y = y)}{P(Y = y)}$
$P(X = x_i \mid Y = y_j) = \frac{P(Y = y_j \mid X = x_i) P(X = x_i)}{\sum_k P(Y = y_j \mid X = x_k) P(X = x_k)}$
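Bayes rule with the total-probability denominator translates almost line for line into code. In this sketch the function name `bayes_posterior` and the two-coin example are illustrative assumptions.

```python
from fractions import Fraction

def bayes_posterior(prior, likelihood):
    """Return P(X = x_i | Y = y_j) for every x_i.

    prior[x]      = P(X = x)
    likelihood[x] = P(Y = y_j | X = x) for the observed y_j
    """
    # Denominator: P(Y = y_j) by the law of total probability.
    evidence = sum(likelihood[x] * prior[x] for x in prior)
    return {x: likelihood[x] * prior[x] / evidence for x in prior}

# Hypothetical example: X is which coin was picked, Y is "the flip came up heads".
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}
posterior = bayes_posterior(prior, likelihood)
print(posterior["biased"])  # 3/5
print(posterior["fair"])    # 2/5
```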
Independent RVs
Intuition: X and Y are independent means that
$X = x$ makes it neither more nor less probable
that $Y = y$
Definition: X and Y are independent iff
$P(X = x \cap Y = y) = P(X = x) P(Y = y)$
More on Independence
$P(X = x \cap Y = y) = P(X = x) P(Y = y)$
$\Leftrightarrow P(X = x \mid Y = y) = P(X = x)$
$\Leftrightarrow P(Y = y \mid X = x) = P(Y = y)$
E.g. no matter how many heads you get, your
friend will not be affected, and vice versa
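The definition can be checked mechanically on a joint table: compute both marginals and compare every cell with their product. The two tables below are made-up examples, and `is_independent` is an illustrative helper name.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint, xs, ys):
    """Check P(X = x, Y = y) == P(X = x) P(Y = y) for every pair (x, y)."""
    p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
    p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}
    return all(joint[(x, y)] == p_x[x] * p_y[y] for x, y in product(xs, ys))

# Two separate fair flips: every outcome pair has probability 1/4.
separate = {(x, y): Fraction(1, 4) for x, y in product([0, 1], repeat=2)}
# The same flip reported twice: X and Y always agree.
copied = {(0, 0): Fraction(1, 2), (0, 1): Fraction(0),
          (1, 0): Fraction(0), (1, 1): Fraction(1, 2)}

print(is_independent(separate, [0, 1], [0, 1]))  # True
print(is_independent(copied, [0, 1], [0, 1]))    # False
```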
Conditionally Independent RVs
Intuition: X and Y are conditionally
independent given Z means that once Z is
known, the value of X does not add any
additional information about Y
Definition: X and Y are conditionally
independent given Z iff
$P(X = x \cap Y = y \mid Z = z) = P(X = x \mid Z = z) P(Y = y \mid Z = z)$
More on Conditional Independence
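$P(X = x \cap Y = y \mid Z = z) = P(X = x \mid Z = z) P(Y = y \mid Z = z)$
$\Leftrightarrow P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z)$
$\Leftrightarrow P(Y = y \mid X = x, Z = z) = P(Y = y \mid Z = z)$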
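The step from the product form to the conditional form is one application of the definition of conditional probability. A short derivation, assuming $P(Y = y \cap Z = z) > 0$ so that every conditional below is defined:

```latex
\begin{align*}
P(X = x \mid Y = y, Z = z)
  &= \frac{P(X = x \cap Y = y \mid Z = z)}{P(Y = y \mid Z = z)} \\
  &= \frac{P(X = x \mid Z = z)\, P(Y = y \mid Z = z)}{P(Y = y \mid Z = z)} \\
  &= P(X = x \mid Z = z)
\end{align*}
```

The statement for $P(Y = y \mid X = x, Z = z)$ follows in the same way with the roles of X and Y swapped.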
Monty Hall Problem
You're given the choice of three doors: Behind one
door is a car; behind the others, goats.
You pick a door, say No. 1
The host, who knows what's behind the doors, opens
another door, say No. 3, which has a goat.
Do you want to pick door No. 2 instead?
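[Figure: the three cases for your initial pick. If you picked the car, the host reveals Goat A or Goat B. If you picked Goat A, the host must reveal Goat B. If you picked Goat B, the host must reveal Goat A.]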
Monty Hall Problem: Bayes Rule
$C_i$: the car is behind door i, i = 1, 2, 3
$P(C_i) = 1/3$
$H_{ij}$: the host opens door j after you pick door i
$P(H_{ij} \mid C_k) = \begin{cases} 0 & i = j \\ 0 & j = k \\ 1/2 & i = k \\ 1 & i \neq k,\ j \neq k \end{cases}$
Monty Hall Problem: Bayes Rule cont.
WLOG, i=1, j=3
$P(C_1 \mid H_{13}) = \frac{P(H_{13} \mid C_1) P(C_1)}{P(H_{13})}$
$P(H_{13} \mid C_1) P(C_1) = \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}$
Monty Hall Problem: Bayes Rule cont.
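$P(H_{13}) = P(H_{13}, C_1) + P(H_{13}, C_2) + P(H_{13}, C_3)$
$= P(H_{13} \mid C_1) P(C_1) + P(H_{13} \mid C_2) P(C_2)$ (the $C_3$ term vanishes because $P(H_{13} \mid C_3) = 0$)
$= \frac{1}{6} + 1 \cdot \frac{1}{3} = \frac{1}{2}$
$P(C_1 \mid H_{13}) = \frac{1/6}{1/2} = \frac{1}{3}$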
Monty Hall Problem: Bayes Rule cont.
$P(C_1 \mid H_{13}) = \frac{1/6}{1/2} = \frac{1}{3}$
$P(C_2 \mid H_{13}) = 1 - P(C_1 \mid H_{13}) = 1 - \frac{1}{3} = \frac{2}{3}$
You should switch!
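The whole Bayes-rule calculation can be reproduced with exact fractions. This sketch encodes the host model from the slides (the host never opens your door or the car's door, and picks either goat door with probability 1/2 when you hold the car); the function names are illustrative.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def p_host_opens(i, j, k):
    """P(H_ij | C_k): host opens door j after you pick door i, car behind door k."""
    if j == i or j == k:
        return Fraction(0)      # host never opens your door or the car's door
    if i == k:
        return Fraction(1, 2)   # you picked the car: either goat door may be opened
    return Fraction(1)          # otherwise exactly one goat door is left to open

def posterior_car(i, j):
    """P(C_k | H_ij) for each door k, by Bayes rule with prior P(C_k) = 1/3."""
    prior = Fraction(1, 3)
    evidence = sum(p_host_opens(i, j, k) * prior for k in DOORS)
    return {k: p_host_opens(i, j, k) * prior / evidence for k in DOORS}

post = posterior_car(1, 3)
print(post[1], post[2], post[3])  # 1/3 2/3 0
```

Switching to door 2 wins with probability 2/3, matching the derivation above.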
Continuous Random Variables
What if X is continuous?
Probability density function (pdf) instead of
probability mass function (pmf)
A pdf is any function $f(x)$ that describes the
probability density in terms of the input
variable x.
PDF
Properties of pdf
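$f(x) \geq 0, \ \forall x$
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
$f(x) \leq 1$ ??? (not required; a density can exceed 1)
Actual probability can be obtained by taking
the integral of pdf
E.g. the probability of X being between 0 and 1 is
$P(0 \leq X \leq 1) = \int_0^1 f(x)\,dx$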
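To make "probability is the integral of the pdf" concrete, here is a numerical sketch. The density is an assumption chosen for illustration (exponential with rate 1, not taken from the slides), and `integrate` is a plain midpoint-rule helper.

```python
from math import exp

def f(x):
    """Illustrative pdf (an assumption): exponential density with rate 1."""
    return exp(-x) if x >= 0 else 0.0

def integrate(pdf, a, b, steps=10_000):
    """Approximate the integral of pdf over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (n + 0.5) * width) for n in range(steps)) * width

prob = integrate(f, 0.0, 1.0)
print(f"P(0 <= X <= 1) ~ {prob:.6f}, exact value {1 - exp(-1):.6f}")

# The total mass should be close to 1; [0, 50] stands in for the whole real line.
total_mass = integrate(f, 0.0, 50.0, steps=200_000)
print(f"total mass ~ {total_mass:.6f}")
```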
Cumulative Distribution Function
$F_X(v) = P(X \leq v)$
Discrete RVs
$F_X(v) = \sum_{v_i \leq v} P(X = v_i)$
Continuous RVs
$F_X(v) = \int_{-\infty}^{v} f(x)\,dx$
$\frac{d}{dx} F_X(x) = f(x)$
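The relation between the CDF and the pdf can be checked numerically by differentiating the CDF. The exponential distribution with rate 1 is used here purely as an assumed example, because its CDF has a simple closed form.

```python
from math import exp

def pdf(x):
    """Illustrative density (an assumption): exponential with rate 1."""
    return exp(-x) if x >= 0 else 0.0

def cdf(v):
    """F_X(v) = P(X <= v), the integral of pdf from -infinity to v."""
    return 1 - exp(-v) if v >= 0 else 0.0

def derivative(func, x, h=1e-5):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.5, 1.0, 3.0):
    print(f"x={x}: dF/dx ~ {derivative(cdf, x):.6f}, f(x) = {pdf(x):.6f}")
```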
Common Distributions
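Normal $X \sim N(\mu, \sigma^2)$
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \ x \in \mathbb{R}$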
E.g. the height of the entire population
[Figure: plot of a normal pdf, f(x) against x, for x from -5 to 5]
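The normal density is easy to evaluate directly. A minimal sketch with the standard library; `normal_pdf` is an illustrative name, and the default parameters (mean 0, standard deviation 1) are an assumption.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sqrt(2 * pi) * sigma)

# Peak height of the standard normal is 1 / sqrt(2 * pi), about 0.3989.
print(f"{normal_pdf(0.0):.4f}")

# Riemann sum over [-8, 8] as a rough check that the density integrates to 1.
step = 0.001
area = sum(normal_pdf(-8 + n * step) for n in range(16_000)) * step
print(f"{area:.6f}")
```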
Common Distributions cont.
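Beta $X \sim \mathrm{Beta}(\alpha, \beta)$
$f(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, \ x \in [0, 1]$
$\alpha = \beta = 1$: uniform distribution between 0 and 1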
E.g. the conjugate prior for the parameter p in
Binomial distribution
[Figure: plot of a Beta pdf, f(x) against x, for x from 0 to 1]
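A sketch of the Beta density, computing the normalizer through the standard identity B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b). The function names are illustrative.

```python
from math import gamma

def beta_function(a, b):
    """B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x; zero outside [0, 1]."""
    if not 0 <= x <= 1:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)

# With a = b = 1 the density is flat: the uniform distribution on [0, 1].
print(beta_pdf(0.3, 1, 1))
# A symmetric, single-peaked case: Beta(2, 2) at its mode, about 1.5.
print(beta_pdf(0.5, 2, 2))
```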
Joint Distribution
Given two continuous RVs X and Y, the joint
pdf can be written as $f_{X,Y}(x, y)$
$\int_x \int_y f_{X,Y}(x, y)\,dx\,dy = 1$
Multivariate Normal
Generalization to higher dimensions of the
one-dimensional normal
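$f_X(x_1, \ldots, x_d) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right)$
Mean: $\mu$
Covariance Matrix: $\Sigma$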
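For d = 2 the multivariate normal density can be evaluated by hand, inverting the 2x2 covariance matrix explicitly. This is a sketch restricted to two dimensions to stay dependency-free; `mvn_pdf_2d` is an illustrative name, and the covariance is assumed to be symmetric with a positive determinant.

```python
from math import exp, pi, sqrt

def mvn_pdf_2d(x, mu, cov):
    """Density of a 2-D normal at point x, with mean mu and 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c                                   # |Sigma|
    inv = ((d / det, -b / det), (-c / det, a / det))      # Sigma^{-1}
    dx = (x[0] - mu[0], x[1] - mu[1])                     # x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    quad = sum(dx[r] * inv[r][s] * dx[s] for r in range(2) for s in range(2))
    dim = 2
    return exp(-0.5 * quad) / ((2 * pi) ** (dim / 2) * sqrt(det))

identity = ((1.0, 0.0), (0.0, 1.0))
# At the mean with identity covariance the density equals 1 / (2 * pi).
print(mvn_pdf_2d((0.0, 0.0), (0.0, 0.0), identity))
```

With a diagonal covariance the result factors into the product of two one-dimensional normal densities, which is a convenient sanity check.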
Moments
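Mean (Expectation): $\mu = E(X)$
Discrete RVs: $E(X) = \sum_{v_i} v_i P(X = v_i)$
Continuous RVs: $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Variance: $V(X) = E[(X - \mu)^2]$
Discrete RVs: $V(X) = \sum_{v_i} (v_i - \mu)^2 P(X = v_i)$
Continuous RVs: $V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$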
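For a discrete RV both moments are finite sums over the pmf. A minimal sketch with exact fractions; the fair six-sided die is an illustrative example, not from the slides.

```python
from fractions import Fraction

def mean(pmf):
    """E(X) = sum_i v_i P(X = v_i)."""
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    """V(X) = sum_i (v_i - mu)^2 P(X = v_i)."""
    mu = mean(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

# Illustrative pmf: a fair six-sided die.
die = {v: Fraction(1, 6) for v in range(1, 7)}
print(mean(die))      # 7/2
print(variance(die))  # 35/12
```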
Properties of Moments
Mean
$E(aX) = aE(X)$
$E(X + Y) = E(X) + E(Y)$
If X and Y are independent, $E(XY) = E(X) E(Y)$
Variance
$V(aX + b) = a^2 V(X)$
If X and Y are independent, $V(X + Y) = V(X) + V(Y)$
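These identities can be verified exactly on small pmfs. In the sketch below the coin, the die, the constants a and b, and all helper names are illustrative; X and Y are taken to be independent by construction.

```python
from fractions import Fraction
from itertools import product

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

def transform(pmf, func):
    """pmf of func(X), merging outcomes that map to the same value."""
    out = {}
    for v, p in pmf.items():
        out[func(v)] = out.get(func(v), 0) + p
    return out

def sum_of_independent(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0) + px * py
    return out

coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
die = {v: Fraction(1, 6) for v in range(1, 7)}
a, b = 3, 5
total = sum_of_independent(coin, die)

print(variance(transform(die, lambda v: a * v + b)) == a**2 * variance(die))  # True
print(mean(total) == mean(coin) + mean(die))                                  # True
print(variance(total) == variance(coin) + variance(die))                      # True
```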
Moments of Common Distributions
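Uniform $X \sim U[1, \ldots, N]$
Mean $(1 + N)/2$; variance $(N^2 - 1)/12$
Binomial $X \sim \mathrm{Bin}(n, p)$
Mean $np$; variance $np(1 - p)$
Normal $X \sim N(\mu, \sigma^2)$
Mean $\mu$; variance $\sigma^2$
Beta $X \sim \mathrm{Beta}(\alpha, \beta)$
Mean $\frac{\alpha}{\alpha + \beta}$; variance $\frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$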
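The closed forms for the two discrete distributions can be confirmed by computing the moments directly from the pmf. The parameter values below (N = 10, n = 8, p = 1/3) are arbitrary choices for the check.

```python
from fractions import Fraction
from math import comb

def mean_and_variance(pmf):
    """Return (E(X), V(X)) computed directly from a pmf given as {value: probability}."""
    mu = sum(v * p for v, p in pmf.items())
    var = sum((v - mu) ** 2 * p for v, p in pmf.items())
    return mu, var

N = 10
uniform = {i: Fraction(1, N) for i in range(1, N + 1)}
print(mean_and_variance(uniform) == (Fraction(1 + N, 2), Fraction(N**2 - 1, 12)))  # True

n, p = 8, Fraction(1, 3)
binomial = {i: comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)}
print(mean_and_variance(binomial) == (n * p, n * p * (1 - p)))                     # True
```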
Probability of Events
X denotes an event that could possibly happen
P(X) denotes the likelihood that X happens,
or X=true
E.g. X=“you will fail in this course”
What’s the probability that you will fail in this
course?
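$\Omega$ denotes the entire event set
$\Omega = \{X, \bar{X}\}$
The Axioms of Probabilities
$0 \leq P(X) \leq 1$
$P(\Omega) = 1$
$P(X_1 \cup X_2 \cup \cdots) = \sum_i P(X_i)$, where $X_i$ are disjoint events
Useful rules
$P(\bar{X}) = 1 - P(X)$
$P(X_1 \cup X_2) = P(X_1) + P(X_2) - P(X_1 \cap X_2)$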
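The useful rules can be checked on a finite sample space by treating events as sets. The die example and the names `OMEGA` and `prob` are illustrative assumptions; every outcome is taken to be equally likely.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))   # outcomes of one fair die roll (illustrative)

def prob(event):
    """P(event) when every outcome in OMEGA is equally likely."""
    return Fraction(len(event & OMEGA), len(OMEGA))

even = {2, 4, 6}
at_least_four = {4, 5, 6}

print(prob(OMEGA))                             # 1
print(prob(OMEGA - even) == 1 - prob(even))    # True
lhs = prob(even | at_least_four)
rhs = prob(even) + prob(at_least_four) - prob(even & at_least_four)
print(lhs, rhs)                                # 2/3 2/3
```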
Interpreting the Axioms
[Figure: Venn diagram of events $X_1$ and $X_2$]