Introduction to Programming

Probabilistic Graphical Models
Representation: Bayesian Networks
Reasoning Patterns
Daphne Koller
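The Student Network
Edges: Difficulty -> Grade, Intelligence -> Grade, Intelligence -> SAT, Grade -> Letter

P(Difficulty)
d0     d1
0.6    0.4

P(Intelligence)
i0     i1
0.7    0.3

P(Grade | Intelligence, Difficulty)
         g1     g2     g3
i0,d0    0.3    0.4    0.3
i0,d1    0.05   0.25   0.7
i1,d0    0.9    0.08   0.02
i1,d1    0.5    0.3    0.2

P(SAT | Intelligence)
      s0     s1
i0    0.95   0.05
i1    0.2    0.8

P(Letter | Grade)
      l0     l1
g1    0.1    0.9
g2    0.4    0.6
g3    0.99   0.01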
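The tables above define a joint distribution through the chain rule for Bayesian networks. The following is a minimal sketch of that factorization, assuming the CPD values as listed on the slide; the dictionary layout, the 0/1 encoding of binary values, and the function name `joint` are illustrative choices, not part of the lecture.

```python
from itertools import product

# CPDs copied from the Student Network tables.
# d = Difficulty, i = Intelligence, g = Grade, s = SAT, l = Letter.
# Binary variables use 0/1 (d0/d1, i0/i1, ...); grades use 1, 2, 3.
P_D = {0: 0.6, 1: 0.4}
P_I = {0: 0.7, 1: 0.3}
P_G = {  # P(g | i, d), keyed by (i, d)
    (0, 0): {1: 0.3, 2: 0.4, 3: 0.3},
    (0, 1): {1: 0.05, 2: 0.25, 3: 0.7},
    (1, 0): {1: 0.9, 2: 0.08, 3: 0.02},
    (1, 1): {1: 0.5, 2: 0.3, 3: 0.2},
}
P_S = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}  # P(s | i), keyed by i
P_L = {  # P(l | g), keyed by g
    1: {0: 0.1, 1: 0.9},
    2: {0: 0.4, 1: 0.6},
    3: {0: 0.99, 1: 0.01},
}


def joint(d, i, g, s, l):
    """Chain rule for this network: P(d) P(i) P(g | i, d) P(s | i) P(l | g)."""
    return P_D[d] * P_I[i] * P_G[(i, d)][g] * P_S[i][s] * P_L[g][l]


# Sanity check: the joint sums to 1 over all 2 * 2 * 3 * 2 * 2 assignments.
total = sum(
    joint(d, i, g, s, l)
    for d, i, g, s, l in product((0, 1), (0, 1), (1, 2, 3), (0, 1), (0, 1))
)

# One entry of the joint: P(d0, i1, g2, s1, l0) = 0.6 * 0.3 * 0.08 * 0.8 * 0.4.
example = joint(d=0, i=1, g=2, s=1, l=0)
print(round(total, 10), round(example, 6))
```

Every reasoning pattern on the following slides is a ratio of sums over this joint, which is why a table of 48 entries is enough to answer all of the queries.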
Causal Reasoning
P(l1) ≈ 0.5
P(l1 | i0) ≈
P(l1 | i0, d0) ≈
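The causal queries can be checked by summing forward from the root variables to Letter. This is a rough sketch, assuming the CPD values from the Student Network slide; the function name `p_l1` and the 0/1 encoding are illustrative, and the printed numbers are computed here from those CPDs rather than read off the slides.

```python
# CPDs from the Student Network slide (SAT is not needed for these queries).
P_D = {0: 0.6, 1: 0.4}
P_I = {0: 0.7, 1: 0.3}
P_G = {  # P(g | i, d), keyed by (i, d)
    (0, 0): {1: 0.3, 2: 0.4, 3: 0.3},
    (0, 1): {1: 0.05, 2: 0.25, 3: 0.7},
    (1, 0): {1: 0.9, 2: 0.08, 3: 0.02},
    (1, 1): {1: 0.5, 2: 0.3, 3: 0.2},
}
P_L1 = {1: 0.9, 2: 0.6, 3: 0.01}  # P(l1 | g)


def p_l1(i=None, d=None):
    """P(l1 | observed i and/or d).

    Intelligence and Difficulty are independent root nodes, so conditioning on
    one of them just fixes its value; an unobserved root is averaged out with
    its prior.
    """
    i_vals = (0, 1) if i is None else (i,)
    d_vals = (0, 1) if d is None else (d,)
    total = 0.0
    for iv in i_vals:
        for dv in d_vals:
            weight = (P_I[iv] if i is None else 1.0) * (P_D[dv] if d is None else 1.0)
            total += weight * sum(P_G[(iv, dv)][g] * P_L1[g] for g in (1, 2, 3))
    return total


print(f"P(l1)          = {p_l1():.3f}")          # 0.502
print(f"P(l1 | i0)     = {p_l1(i=0):.3f}")       # 0.389
print(f"P(l1 | i0, d0) = {p_l1(i=0, d=0):.3f}")  # 0.513
```

Learning that the student is not intelligent lowers the chance of a strong letter; learning in addition that the class is easy raises it again.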
Evidential Reasoning
Evidence: student gets a C (g3)
P(d1) = 0.4
P(d1 | g3) ≈ 0.63
P(i1) = 0.3
P(i1 | g3) ≈ 0.08
We find out that class is hard
Evidence: student gets a C; class is hard!
• What happens to the posterior probability of high intelligence?
Goes up
Goes down
Doesn’t change
We can’t know
Intercausal Reasoning
Evidence: student gets a C (g3); class is hard (d1)
P(d1) = 0.4
P(d1 | g3) ≈ 0.63
P(i1) = 0.3
P(i1 | g3) ≈ 0.08
P(i1 | g3, d1) ≈ 0.11
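The evidential and intercausal numbers on this slide can be reproduced by enumeration over Difficulty, Intelligence and Grade. This is a minimal sketch, assuming the CPDs from the Student Network slide; the helper names `joint_dig` and `posterior` are illustrative. SAT and Letter are left out because they are unobserved here and sum out to 1.

```python
P_D = {0: 0.6, 1: 0.4}
P_I = {0: 0.7, 1: 0.3}
P_G = {  # P(g | i, d), keyed by (i, d)
    (0, 0): {1: 0.3, 2: 0.4, 3: 0.3},
    (0, 1): {1: 0.05, 2: 0.25, 3: 0.7},
    (1, 0): {1: 0.9, 2: 0.08, 3: 0.02},
    (1, 1): {1: 0.5, 2: 0.3, 3: 0.2},
}


def joint_dig(d, i, g):
    """P(d, i, g) = P(d) P(i) P(g | i, d)."""
    return P_D[d] * P_I[i] * P_G[(i, d)][g]


def posterior(target, evidence):
    """P(target | evidence); both are dicts over the keys 'd', 'i', 'g'."""
    num = den = 0.0
    for d in (0, 1):
        for i in (0, 1):
            for g in (1, 2, 3):
                world = {"d": d, "i": i, "g": g}
                if any(world[k] != v for k, v in evidence.items()):
                    continue  # inconsistent with the evidence
                p = joint_dig(d, i, g)
                den += p
                if all(world[k] == v for k, v in target.items()):
                    num += p
    return num / den


p_d1_g3 = posterior({"d": 1}, {"g": 3})             # evidential
p_i1_g3 = posterior({"i": 1}, {"g": 3})             # evidential
p_i1_g3_d1 = posterior({"i": 1}, {"g": 3, "d": 1})  # intercausal
print(f"{p_d1_g3:.2f} {p_i1_g3:.2f} {p_i1_g3_d1:.2f}")  # 0.63 0.08 0.11
```

The last value is higher than the second: a hard class partly accounts for the C, so the student's intelligence is blamed a little less.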
Intercausal Reasoning II
Evidence: student gets a B (g2); class is hard (d1)
P(i1) = 0.3
P(i1 | g2) ≈
P(i1 | g2, d1) ≈
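To see how the effect of learning that the class is hard depends on the grade, the posterior of high intelligence can be tabulated for each grade with Bayes' rule. This is a sketch under the CPDs of the Student Network slide; the function name `p_i1_given` is an illustrative choice, and the printed values are computed here rather than taken from the slides.

```python
P_D = {0: 0.6, 1: 0.4}
P_I = {0: 0.7, 1: 0.3}
P_G = {  # P(g | i, d), keyed by (i, d)
    (0, 0): {1: 0.3, 2: 0.4, 3: 0.3},
    (0, 1): {1: 0.05, 2: 0.25, 3: 0.7},
    (1, 0): {1: 0.9, 2: 0.08, 3: 0.02},
    (1, 1): {1: 0.5, 2: 0.3, 3: 0.2},
}


def p_i1_given(g, d=None):
    """P(i1 | g) when d is None, otherwise P(i1 | g, d)."""
    score = {}
    for i in (0, 1):
        if d is None:
            # Difficulty unobserved: average the grade CPD over its prior.
            likelihood = sum(P_D[dv] * P_G[(i, dv)][g] for dv in (0, 1))
        else:
            # Difficulty observed: P(d) is a common factor and cancels.
            likelihood = P_G[(i, d)][g]
        score[i] = P_I[i] * likelihood
    return score[1] / (score[0] + score[1])


for g in (1, 2, 3):
    print(f"g{g}: P(i1 | g) = {p_i1_given(g):.3f}, "
          f"P(i1 | g, d1) = {p_i1_given(g, d=1):.3f}")
# The g2 row prints 0.175 and 0.340.
```

For a B the jump is much larger than for a C, because a B in a hard class is reasonably likely for an intelligent student while a C is not.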
Student Aces the SAT
Evidence: student gets a C; student aces the SAT
• What happens to the posterior probability that the class is hard?
Goes up
Goes down
Doesn’t change
We can’t know
Multiple Evidence
Evidence: student gets a C (g3); student aces the SAT (s1)
P(d1) = 0.4
P(d1 | g3) ≈ 0.63
P(d1 | g3, s1) ≈
P(i1) = 0.3
P(i1 | g3) ≈ 0.08
P(i1 | g3, s1) ≈
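With both the grade and the SAT score observed, the two posteriors can be obtained by weighting each (Difficulty, Intelligence) pair by how well it explains the evidence. This is a minimal sketch, assuming the CPDs from the Student Network slide; the helper name `weights` is illustrative, and the printed values are computed here, not quoted from the slides.

```python
P_D = {0: 0.6, 1: 0.4}
P_I = {0: 0.7, 1: 0.3}
P_G = {  # P(g | i, d), keyed by (i, d)
    (0, 0): {1: 0.3, 2: 0.4, 3: 0.3},
    (0, 1): {1: 0.05, 2: 0.25, 3: 0.7},
    (1, 0): {1: 0.9, 2: 0.08, 3: 0.02},
    (1, 1): {1: 0.5, 2: 0.3, 3: 0.2},
}
P_S = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}  # P(s | i), keyed by i


def weights(g, s):
    """Unnormalized P(d, i, g, s) for each (d, i); Letter sums out to 1."""
    return {
        (d, i): P_D[d] * P_I[i] * P_G[(i, d)][g] * P_S[i][s]
        for d in (0, 1)
        for i in (0, 1)
    }


w = weights(g=3, s=1)  # student gets a C and aces the SAT
z = sum(w.values())    # P(g3, s1)
p_d1 = sum(p for (d, i), p in w.items() if d == 1) / z
p_i1 = sum(p for (d, i), p in w.items() if i == 1) / z
print(f"P(d1 | g3, s1) = {p_d1:.2f}")  # 0.76
print(f"P(i1 | g3, s1) = {p_i1:.2f}")  # 0.58
```

The strong SAT score pushes the probability of high intelligence well above its prior despite the C, and the C is then explained by a harder class.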
END
Suppose θ is at a local minimum of a function. What will one iteration of gradient descent do?
Leave θ unchanged.
Change θ in a random direction.
Move θ towards the global minimum of J(θ).
Decrease θ.
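The question can be checked numerically: at a local minimum the derivative is zero, so the update subtracts nothing. The sketch below assumes the standard update θ := θ − α · dJ/dθ; the function J, its derivative `dJ`, and the step size are hypothetical choices made only for illustration.

```python
def dJ(theta):
    """Derivative of the hypothetical J(theta) = theta**4/4 + theta**3/3 - theta**2.

    It factors as (theta + 2) * theta * (theta - 1), so J has its global
    minimum at theta = -2, a local maximum at 0, and a local (non-global)
    minimum at theta = 1.
    """
    return (theta + 2.0) * theta * (theta - 1.0)


def gradient_step(theta, grad, alpha):
    """One gradient descent update: theta := theta - alpha * grad(theta)."""
    return theta - alpha * grad(theta)


at_local_min = gradient_step(1.0, dJ, alpha=0.1)
near_local_min = gradient_step(1.5, dJ, alpha=0.1)
print(at_local_min)    # 1.0: the gradient is zero, so theta is unchanged
print(near_local_min)  # between 1.0 and 1.5: moves toward the local minimum
```

Starting near θ = 1 the iterate heads for the local minimum at 1, not the global minimum at −2, and once it is at 1 it stays there.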
Consider the weight update:
Which of these is a correct vectorized implementation?
Fig. A corresponds to α=0.01, Fig. B to α=0.1, Fig. C to α=1.
Fig. A corresponds to α=0.1, Fig. B to α=0.01, Fig. C to α=1.
Fig. A corresponds to α=1, Fig. B to α=0.01, Fig. C to α=0.1.
Fig. A corresponds to α=1, Fig. B to α=0.1, Fig. C to α=0.01.