Maximum Likelihood Estimate
Jyh-Shing Roger Jang (張智星)
CSIE Dept, National Taiwan University
Intro. to Maximum Likelihood Estimate
MLE: Maximum likelihood estimate
Goal: Given a dataset with no labels, how can we find the best model
with the optimum parameters to describe the data?
Applications
Prediction
Analysis
What Are Models?
Models are used to describe the probabilities of random variables
Discrete variables → Probabilities
Continuous variables → Probability density functions (PDF)
Examples
Discrete variables
The outcome of tossing a coin or a die
Continuous variables
The distance to the bull's eye when throwing a dart
The time needed to run a 100-m dash
The heights of second-grade students
Personalized PDF!
More about Models
Discrete variables
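Outcome of tossing a coin → Pr{head} = 1/2, Pr{tail} = 1/2
Continuous variables
Distance to the bull's eye when throwing a dart → A PDF of Gaussian or normal distribution
$$g(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$$
Quiz! Probability of x in [4, 6]:
$$\Pr\{x \in [4, 6]\} = \int_{4}^{6} g(x; \mu, \sigma^2)\,dx$$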
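The interval probability in the quiz can be evaluated once the two Gaussian parameters are fixed. The sketch below is only an illustration: the values mu = 5 and sigma = 1 are assumptions that do not come from the slides, and the function names are hypothetical. It computes the probability from the Gaussian CDF (via the error function) and cross-checks it by numerically integrating the PDF over [4, 6].

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_cdf(x, mu, sigma):
    """Pr{X <= x}, expressed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probability(a, b, mu, sigma):
    """Pr{a <= X <= b}, i.e. the integral of the PDF from a to b."""
    return gaussian_cdf(b, mu, sigma) - gaussian_cdf(a, mu, sigma)

# Assumed parameters, chosen only for illustration.
mu, sigma = 5.0, 1.0
prob = interval_probability(4.0, 6.0, mu, sigma)

# Cross-check: midpoint Riemann sum of the PDF over [4, 6].
steps = 10_000
width = (6.0 - 4.0) / steps
riemann = width * sum(
    gaussian_pdf(4.0 + (k + 0.5) * width, mu, sigma) for k in range(steps)
)

print(f"closed form: {prob:.4f}, numerical integral: {riemann:.4f}")
```

With these assumed parameters the interval spans one standard deviation on each side of the mean, so both numbers come out near 0.68.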
Basic Steps in MLE
Steps
1. Perform a certain experiment to collect the data.
2. Choose a parametric model of the data, with certain modifiable parameters.
3. Formulate the likelihood as an objective function to be maximized.
4. Maximize the objective function and derive the parameters of the model.
Examples
Toss a coin → To find the probabilities of head and tail
Throw a dart → To find your PDF of distance to the bull's eye
Sample a group of animals → To find the quantity of animals
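The slides do not say how the animal example is set up, so the following is only one possible reading, walked through the four steps: a capture-recapture experiment in which some animals are tagged and released, and a later catch is checked for tags. The hypergeometric model, the counts, the search range, and all names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def likelihood(N, tagged, sample_size, tagged_in_sample):
    """Probability of the observed recapture count if the population size is N
    (hypergeometric model, assumed here for illustration)."""
    untagged_in_sample = sample_size - tagged_in_sample
    return Fraction(
        comb(tagged, tagged_in_sample) * comb(N - tagged, untagged_in_sample),
        comb(N, sample_size),
    )

# Step 1: the experiment (made-up numbers).
tagged = 50            # animals tagged and released
sample_size = 40       # animals caught in the second round
tagged_in_sample = 7   # tagged animals found in the second round

# Steps 2 and 3: the model and its likelihood are defined by likelihood().
# Step 4: maximize over N by exhaustive search on an assumed range.
smallest_N = tagged + (sample_size - tagged_in_sample)
N_hat = max(
    range(smallest_N, 2001),
    key=lambda N: likelihood(N, tagged, sample_size, tagged_in_sample),
)

print(f"estimated population size: {N_hat}")
```

For these numbers the search returns 285, which agrees with the intuitive estimate of scaling the tagged count by the inverse of the tagged fraction in the sample, 50 × 40 / 7, rounded down.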
Probability Model for Discrete Variables
Toss an unfair coin 5 times to get 3 heads and 2 tails
Our intuition: Pr{head}=3/5, Pr{tail}=2/5
By MLE
Assume these 5 tosses are independent events, so the overall probability is
$$J(p, q) = p^3 q^2, \text{ with } p + q = 1,\ p \ge 0,\ q \ge 0$$
$$J(p) = p^3 (1 - p)^2$$
$$\frac{\partial J(p)}{\partial p} = 0 \Rightarrow p = 3/5,\ q = 2/5$$
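The result for the coin can be checked numerically. The sketch below evaluates the likelihood of 3 heads and 2 tails on a grid of candidate values for p and keeps the maximizer. The grid resolution and the function name are arbitrary choices for illustration.

```python
def likelihood(p, heads=3, tails=2):
    """J(p) = p^heads * (1 - p)^tails for independent tosses."""
    return p ** heads * (1.0 - p) ** tails

# Evaluate J on a fine grid over [0, 1] and keep the value of p that maximizes it.
grid = [k / 1000 for k in range(1001)]
p_hat = max(grid, key=likelihood)
q_hat = 1.0 - p_hat

print(f"p_hat = {p_hat:.3f}, q_hat = {q_hat:.3f}")
```

The grid maximizer lands on p = 0.6, matching both the intuition and the stationary point of J(p).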
Inequality of Arithmetic and Geometric Means
AM-GM inequality
Quiz!
$$\frac{1}{n}\sum_{i=1}^{n} x_i \ge \left(\prod_{i=1}^{n} x_i\right)^{1/n}, \text{ with } x_i \ge 0,\ \forall i$$
The equality holds only when $x_1 = x_2 = \cdots = x_n$.
Proof of this inequality: Wikipedia
How can the inequality be used to solve an MLE problem?
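One way to answer this question, sketched for the earlier coin example (3 heads, 2 tails, with p + q = 1): apply the AM-GM inequality to the five numbers p/3, p/3, p/3, q/2, q/2. The choice of these five numbers is the key step, and it is a suggestion rather than something stated on the slides.

```latex
\frac{1}{5}\left(3\cdot\frac{p}{3} + 2\cdot\frac{q}{2}\right)
  = \frac{p+q}{5} = \frac{1}{5}
  \;\ge\;
  \left[\left(\frac{p}{3}\right)^{3}\left(\frac{q}{2}\right)^{2}\right]^{1/5}
\quad\Longrightarrow\quad
J(p, q) = p^{3}q^{2} \;\le\; \frac{3^{3}\,2^{2}}{5^{5}}
```

Equality, and therefore the maximum of J, requires p/3 = q/2. Together with p + q = 1 this gives p = 3/5 and q = 2/5, the same answer as the derivative-based solution.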
Probability Model for Discrete Variables
Toss a three-sided die many times and obtain n1 of side 1,
n2 of side 2, and n3 of side 3. What are the most
likely probabilities for sides 1, 2, and 3, respectively?
Our intuition…
By MLE…
$$J(p, q, r) = p^{n_1} q^{n_2} r^{n_3}, \text{ with } p + q + r = 1,\ p \ge 0,\ q \ge 0,\ r \ge 0$$
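The slide leaves the answer open. A natural guess is that each probability equals its observed frequency, n1/n, n2/n, n3/n with n = n1 + n2 + n3. The sketch below tests that guess numerically for made-up counts by searching a grid over all (p, q, r) with p + q + r = 1. The counts, the grid step, and the names are assumptions for illustration only.

```python
import math

def log_likelihood(p, q, r, n1, n2, n3):
    """ln J(p, q, r) = n1 ln p + n2 ln q + n3 ln r."""
    return n1 * math.log(p) + n2 * math.log(q) + n3 * math.log(r)

# Made-up counts for the three sides.
n1, n2, n3 = 2, 3, 5
n = n1 + n2 + n3

# Candidate answer: relative frequencies.
frequency_estimate = (n1 / n, n2 / n, n3 / n)

# Grid search over interior points of the simplex p + q + r = 1, step 0.01.
steps = 100
best = max(
    ((i, j, steps - i - j) for i in range(1, steps) for j in range(1, steps - i)),
    key=lambda t: log_likelihood(
        t[0] / steps, t[1] / steps, t[2] / steps, n1, n2, n3
    ),
)
grid_estimate = tuple(k / steps for k in best)

print("relative frequencies:", frequency_estimate)
print("grid maximizer:      ", grid_estimate)
```

Maximizing the logarithm instead of J itself does not change the maximizer and avoids very small products when the counts are large.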
MLE for PDF of Continuous Variables of 1D
Detailed coverage
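PDF
$$g(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$$
Overall PDF, or likelihood
$$X = \{x_1, x_2, \ldots, x_n\}$$
$$p(X; \mu, \sigma^2) = \prod_{i=1}^{n} g(x_i; \mu, \sigma^2)$$
Log likelihood
$$J(\mu, \sigma^2) = \ln p(X; \mu, \sigma^2) = \ln \prod_{i=1}^{n} g(x_i; \mu, \sigma^2) = \sum_{i=1}^{n} \ln g(x_i; \mu, \sigma^2)$$
$$= \sum_{i=1}^{n} \left[-\frac{1}{2}\ln(2\pi) - \ln\sigma - \frac{1}{2}\left(\frac{x_i-\mu}{\sigma}\right)^2\right]$$
$$= -\frac{n}{2}\ln(2\pi) - n\ln\sigma - \frac{1}{2}\sum_{i=1}^{n}\left(\frac{x_i-\mu}{\sigma}\right)^2$$
MLE
$$\frac{\partial J(\mu, \sigma^2)}{\partial \mu} = \sum_{i=1}^{n} \frac{x_i-\mu}{\sigma^2} = 0 \Rightarrow \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
$$\frac{\partial J(\mu, \sigma^2)}{\partial \sigma} = -\frac{n}{\sigma} + \sum_{i=1}^{n} \frac{(x_i-\mu)^2}{\sigma^3} = 0 \Rightarrow \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\hat{\mu})^2$$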
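The closed-form estimates for the 1D Gaussian are the sample mean and the variance that divides by n (not n - 1). The following sketch computes both for a small made-up dataset and checks that nearby parameter values give a lower log likelihood. The data values and function names are hypothetical.

```python
import math
import statistics

def gaussian_mle(samples):
    """Return (mu_hat, var_hat): sample mean and the variance that divides by n."""
    n = len(samples)
    mu_hat = sum(samples) / n
    var_hat = sum((x - mu_hat) ** 2 for x in samples) / n
    return mu_hat, var_hat

def log_likelihood(samples, mu, var):
    """J(mu, sigma^2) = -(n/2) ln(2 pi) - (n/2) ln(sigma^2) - sum((x - mu)^2) / (2 sigma^2)."""
    n = len(samples)
    return (
        -0.5 * n * math.log(2.0 * math.pi)
        - 0.5 * n * math.log(var)
        - sum((x - mu) ** 2 for x in samples) / (2.0 * var)
    )

# Made-up measurements, for illustration only.
samples = [2.1, 3.4, 1.9, 2.8, 3.0, 2.6]
mu_hat, var_hat = gaussian_mle(samples)
best = log_likelihood(samples, mu_hat, var_hat)

# The standard library offers the same two quantities.
print(mu_hat, statistics.fmean(samples))
print(var_hat, statistics.pvariance(samples))

# Perturbing either parameter should not increase the log likelihood.
print(best >= log_likelihood(samples, mu_hat + 0.1, var_hat))
print(best >= log_likelihood(samples, mu_hat, var_hat * 1.2))
```

Note that `statistics.pvariance` (population variance) matches the MLE, whereas `statistics.variance` divides by n - 1 and would give a slightly larger value.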
MLE for PDF of Continuous Variables of ND
Detailed coverage
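PDF
$$g(x; \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^d |\Sigma|}} \exp\left[-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right]$$
Overall PDF, or likelihood
$$X = \{x_1, x_2, \ldots, x_n\}$$
$$p(X; \mu, \Sigma) = \prod_{i=1}^{n} g(x_i; \mu, \Sigma)$$
Log likelihood
$$J(\mu, \Sigma) = \ln p(X; \mu, \Sigma) = \ln \prod_{i=1}^{n} g(x_i; \mu, \Sigma) = \sum_{i=1}^{n} \ln g(x_i; \mu, \Sigma)$$
$$= \sum_{i=1}^{n} \left[-\frac{d}{2}\ln(2\pi) - \frac{1}{2}\ln|\Sigma| - \frac{1}{2}(x_i-\mu)^T \Sigma^{-1} (x_i-\mu)\right]$$
$$= -\frac{nd}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^{n} (x_i-\mu)^T \Sigma^{-1} (x_i-\mu)$$
MLE
$$\nabla_{\mu} J(\mu, \Sigma) = -\frac{1}{2}\sum_{i=1}^{n} 2\,\Sigma^{-1}(\mu - x_i) = \Sigma^{-1}\left(\sum_{i=1}^{n} x_i - n\mu\right)$$
$$\nabla_{\mu} J(\mu, \Sigma) = 0 \Rightarrow \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
$$\nabla_{\Sigma} J(\mu, \Sigma) = 0 \Rightarrow \hat{\Sigma} = \frac{1}{n}\sum_{i=1}^{n} (x_i-\hat{\mu})(x_i-\hat{\mu})^T$$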
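For the multivariate case the estimates are the sample mean vector and the covariance matrix that divides by n. A minimal pure-Python sketch follows, using four made-up 2D points. The data and the function name are hypothetical, and nested lists stand in for vectors and matrices.

```python
def gaussian_mle_nd(points):
    """Return (mu_hat, sigma_hat) for a list of d-dimensional points.

    mu_hat is the mean vector; sigma_hat is the d-by-d matrix
    (1/n) * sum over i of (x_i - mu_hat)(x_i - mu_hat)^T.
    """
    n = len(points)
    d = len(points[0])
    mu_hat = [sum(p[k] for p in points) / n for k in range(d)]
    centered = [[p[k] - mu_hat[k] for k in range(d)] for p in points]
    sigma_hat = [
        [sum(c[r] * c[s] for c in centered) / n for s in range(d)]
        for r in range(d)
    ]
    return mu_hat, sigma_hat

# Made-up 2D data, for illustration only.
points = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
mu_hat, sigma_hat = gaussian_mle_nd(points)

print("mean vector:", mu_hat)
print("covariance matrix:")
for row in sigma_hat:
    print(row)
```

For larger datasets a numerical library is the usual choice. With NumPy, for instance, `np.cov(X, rowvar=False, bias=True)` computes the same divide-by-n covariance for an array X with one sample per row.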
Q&A
Questions
Can we choose other probability models instead of
Gaussian/normal distributions? Yes!
What other PDFs are available?
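As one illustration of a non-Gaussian choice (the slides do not name any specific alternative), the same four steps can be applied to an exponential distribution with density λ·exp(-λx) for x ≥ 0. Its log likelihood is n ln λ - λ Σ x_i, and setting the derivative to zero gives λ = n / Σ x_i. The data and names below are made up for illustration.

```python
import math

def exponential_mle(samples):
    """MLE of the rate: n / sum(x_i), i.e. one over the sample mean."""
    return len(samples) / sum(samples)

def log_likelihood(samples, rate):
    """J(rate) = n ln(rate) - rate * sum(x_i)."""
    return len(samples) * math.log(rate) - rate * sum(samples)

# Made-up waiting times, for illustration only.
waiting_times = [0.5, 1.0, 1.5, 2.0]
rate_hat = exponential_mle(waiting_times)

print(f"rate_hat = {rate_hat:.3f}")
# Nearby rates should give a lower log likelihood.
print(log_likelihood(waiting_times, rate_hat) > log_likelihood(waiting_times, 0.7))
print(log_likelihood(waiting_times, rate_hat) > log_likelihood(waiting_times, 0.9))
```

Which model is appropriate depends on the data: the exponential form only makes sense for non-negative quantities such as waiting times.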