Hidden Markov models
Sushmita Roy BMI/CS 576 www.biostat.wisc.edu/bmi576/ [email protected]
Oct 16th, 2014
Key concepts
• What are Hidden Markov models (HMMs)?
  – States, emission characters, parameters
• Three important questions in HMMs and algorithms to solve them
  – Probability of a sequence of observations: Forward algorithm
  – Most likely path (sequence of states): Viterbi algorithm
  – Parameter estimation: Baum-Welch/Forward-backward algorithm
Revisiting the CpG question
• Given a sequence x_1..x_T, we can use two Markov chains to decide if x_1..x_T is a CpG island or not
• What do we do if we were asked to "find" the CpG islands in the genome?
• We have to search for these "islands" in a "sea" of non-islands
A simple HMM for identifying CpG islands
[Figure: an 8-state HMM with CpG island states A+, C+, G+, T+ and background states A-, C-, G-, T-, with transitions within and between the two groups.]

Note: there is no longer a one-to-one correspondence between states and observed symbols.
An HMM for an occasionally dishonest casino

• States: Fair and Loaded (one state per die)
• Emission probabilities:
  – Fair: 1, 2, 3, 4, 5, 6 each with probability 1/6
  – Loaded: 1, 2, 3, 4, 5 each with probability 1/10; 6 with probability 1/2
• Transition probabilities:
  – Fair → Fair 0.95, Fair → Loaded 0.05
  – Loaded → Loaded 0.9, Loaded → Fair 0.1
• What is hidden? Which die is rolled
• What is observed? The number (1-6) on the die
What does an HMM do?
• Enables us to model observed sequences of characters generated by a hidden dynamic system
• The system can exist in a fixed number of "hidden" states
• The system probabilistically transitions between states, and at each state it emits a symbol/character (a sketch of this generative process is shown below)
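To make the generative process concrete, here is a minimal Python sketch (my own illustration, not code from the lecture) that samples a hidden state path and an observed roll sequence from the occasionally dishonest casino HMM above; the function name `sample_casino` and the use of the `random` module are my choices.

```python
import random

# Transition and emission probabilities from the casino slide
transitions = {"Fair": {"Fair": 0.95, "Loaded": 0.05},
               "Loaded": {"Loaded": 0.9, "Fair": 0.1}}
emissions = {"Fair": {r: 1/6 for r in "123456"},
             "Loaded": {**{r: 1/10 for r in "12345"}, "6": 1/2}}

def sample_casino(length, start="Fair"):
    """Sample a hidden state path and an observed roll sequence of the given length."""
    state, path, rolls = start, [], []
    for _ in range(length):
        path.append(state)
        symbols, probs = zip(*emissions[state].items())
        rolls.append(random.choices(symbols, probs)[0])   # emit a symbol from the current state
        next_states, probs = zip(*transitions[state].items())
        state = random.choices(next_states, probs)[0]      # probabilistically transition
    return path, rolls

path, rolls = sample_casino(10)
print("hidden:  ", " ".join(path))
print("observed:", " ".join(rolls))
```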
Formally defining an HMM
• States
• Emission alphabet
• Parameters
  – State transition probabilities: probabilities of transitioning from the state at time t to the state at time t+1
  – Emission probabilities: probabilities of emitting symbols from a state
Notation
• States with emissions will be numbered from 1 to K
  – 0 is the begin state, N is the end state
• x_t : observed character at position t
• x = x_1..x_T : observed sequence
• π = π_1..π_T : hidden state sequence or path
• Transition probabilities: a_kl = P(π_t = l | π_t-1 = k)
• Emission probabilities: e_k(b) = P(x_t = b | π_t = k), the probability of emitting symbol b from state k
An example HMM
[Figure: a five-state HMM with a begin state (0), emitting states 1-4, and an end state (5). For example, a_13 = 0.8 is the probability of a transition from state 1 to state 3, and e_2(A) = 0.4 is the probability of emitting character A in state 2.]

• Transition probabilities: a_01 = 0.5, a_02 = 0.5, a_11 = 0.2, a_13 = 0.8, a_22 = 0.8, a_24 = 0.2, a_33 = 0.4, a_35 = 0.6, a_44 = 0.1, a_45 = 0.9
• Emission probabilities:
  – State 1: A 0.4, C 0.1, G 0.2, T 0.3
  – State 2: A 0.4, C 0.1, G 0.1, T 0.4
  – State 3: A 0.2, C 0.3, G 0.3, T 0.2
  – State 4: A 0.1, C 0.4, G 0.4, T 0.1
Path notation
[Figure: the same five-state example HMM, with the path 1 → 1 → 3 highlighted.]

For the example HMM above, the joint probability of the sequence AAC and the path π = 1, 1, 3 is

  P(AAC, π) = a_01 × e_1(A) × a_11 × e_1(A) × a_13 × e_3(C) × a_35
            = 0.5 × 0.4 × 0.2 × 0.4 × 0.8 × 0.3 × 0.6
            ≈ 0.0023

(a short code check of this product is shown below)
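As a check, here is a small Python sketch (mine, not from the slides) that multiplies out the same factors for the sequence AAC and path 1, 1, 3; the dictionary encoding holds only the transition and emission entries this particular path needs.

```python
# Transition and emission probabilities of the example HMM (0 = begin state, 5 = end state)
a = {(0, 1): 0.5, (1, 1): 0.2, (1, 3): 0.8, (3, 5): 0.6}
e = {1: {"A": 0.4}, 3: {"C": 0.3}}          # only the entries this path needs

x, path = "AAC", [1, 1, 3]

p = a[(0, path[0])]                          # initial transition from the begin state
for t, (ch, k) in enumerate(zip(x, path)):
    p *= e[k][ch]                            # emit x_t from state pi_t
    nxt = path[t + 1] if t + 1 < len(path) else 5   # next state, or the end state after the last symbol
    p *= a[(k, nxt)]                         # transition to the next state

print(p)                                     # ≈ 0.002304
```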
Three important questions in HMMs
• How likely is an HMM to have generated a given sequence?
  – Forward algorithm
• What is the most likely "path" for generating a sequence of observations?
  – Viterbi algorithm
• How can we learn an HMM from a set of sequences?
  – Forward-backward or Baum-Welch (an EM algorithm)
How likely is a given sequence from an HMM?
The joint probability of an observed sequence x and a path π is

  P(x, π) = a_0π1 × Π_t=1..T [ e_πt(x_t) × a_πt,πt+1 ]

where a_0π1 is the initial transition from the begin state, e_πt(x_t) is the probability of emitting symbol x_t in state πt, and a_πt,πt+1 is the state transition between consecutive time points (with πT+1 taken to be the end state).
How likely is a given sequence from an HMM?
• But we don't know what the path is
• So we need to sum over all paths
• The probability over all paths is:
  P(x) = Σ_π P(x, π)
Example
• Consider a candidate CpG island: CGCGC
• Using our CpG island HMM, some possible paths consistent with this sequence are:
  – C+ G+ C+ G+ C+
  – C- G- C- G- C-
  – C- G+ C- G+ C-
Number of paths
• For a sequence of length T, how many possible paths through this HMM are there? 2^T

[Figure: a two-state HMM with begin state 0, emitting states 1 and 2, and end state 3. State 1 emissions: A 0.4, C 0.1, G 0.2, T 0.3; state 2 emissions: A 0.4, C 0.1, G 0.1, T 0.4.]

• The Forward algorithm enables us to compute the probability of a sequence by efficiently summing over all possible paths
How likely is a given sequence: Forward algorithm
• Define f_k(t) as the probability of observing x_1..x_t and ending in state k at time t:
  f_k(t) = P(x_1..x_t, π_t = k)
• This can be written recursively as follows:
  f_l(t) = e_l(x_t) Σ_k f_k(t-1) a_kl
Steps of the Forward algorithm
• Initialization: f_0(0) = 1, f_k(0) = 0 for k > 0 (0 denotes the "begin" state)
• Recursion: for t = 1 to T
  f_l(t) = e_l(x_t) Σ_k f_k(t-1) a_kl
• Termination:
  P(x) = Σ_k f_k(T) a_kN
(a small code sketch of these steps follows below)
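To make the recursion concrete, here is a minimal Python sketch of the Forward algorithm (my own illustration, not code from the lecture). It assumes the HMM is given as nested dictionaries: `a[k][l]` for transition probabilities (with begin state 0 and an end state) and `e[k][b]` for emission probabilities; run on the example HMM from the earlier slide, it reproduces the value for TAGA computed in the table on the next slide.

```python
def forward(x, a, e, states, begin=0, end=None):
    """Compute P(x) by summing over all paths through the HMM."""
    T = len(x)
    # f[t][k] = probability of observing x_1..x_t and ending in state k at time t
    f = [{k: 0.0 for k in states} for _ in range(T + 1)]

    # Initialization: at t=1 every emitting state is reached directly from the begin state
    for k in states:
        f[1][k] = a[begin].get(k, 0.0) * e[k].get(x[0], 0.0)

    # Recursion: f_l(t) = e_l(x_t) * sum_k f_k(t-1) * a_kl
    for t in range(2, T + 1):
        for l in states:
            f[t][l] = e[l].get(x[t - 1], 0.0) * sum(
                f[t - 1][k] * a[k].get(l, 0.0) for k in states)

    # Termination: P(x) = sum_k f_k(T) * a_kN
    return sum(f[T][k] * a[k].get(end, 0.0) for k in states)


# The example HMM from the slides: begin state 0, emitting states 1-4, end state 5
a = {0: {1: 0.5, 2: 0.5},
     1: {1: 0.2, 3: 0.8},
     2: {2: 0.8, 4: 0.2},
     3: {3: 0.4, 5: 0.6},
     4: {4: 0.1, 5: 0.9}}
e = {1: {"A": 0.4, "C": 0.1, "G": 0.2, "T": 0.3},
     2: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
     3: {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
     4: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}

print(forward("TAGA", a, e, states=[1, 2, 3, 4], end=5))   # ≈ 0.00046224
```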
Forward algorithm example
[Figure: the same five-state example HMM as before (begin state 0, emitting states 1-4, end state 5).]

What is the probability of sequence TAGA?
In class exercise
Table for TAGA
Entries are f_k(t):

State   t=0   t=1 (T)   t=2 (A)   t=3 (G)   t=4 (A)
0       1     0         0         0         0
1       0     0.15      0.012     0.00048   0.0000384
2       0     0.2       0.064     0.00512   0.0016384
3       0     0         0.024     0.00576   0.0005376
4       0     0         0.004     0.00528   0.0001552
5       0     0         0         0         0.00046224

The entry f_5(4) = 0.00046224 is also P(x); it does not require f_1(4) and f_2(4) (see the termination step spelled out below).
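Spelling out the termination step with the table entries and the transition probabilities of the example HMM: only states 3 and 4 can transition to the end state, so

  P(x) = f_3(4) × a_35 + f_4(4) × a_45 = 0.0005376 × 0.6 + 0.0001552 × 0.9 = 0.00046224

which is why f_1(4) and f_2(4) are not needed.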
Three important questions in HMMs
• How likely is an HMM to have generated a given sequence?
  – Forward algorithm
• What is the most likely "path" for generating a sequence of observations?
  – Viterbi algorithm
• How can we learn an HMM from a set of sequences?
  – Forward-backward or Baum-Welch (an EM algorithm)
Viterbi algorithm
• The Viterbi algorithm gives an efficient way to find the most likely/probable state sequence (path)
• Consider the dishonest casino example
  – Given a sequence of dice rolls, can you infer when the casino was using the loaded versus the fair die?
  – The Viterbi algorithm gives this answer
• Viterbi is very similar to the Forward algorithm
  – Except instead of summing we maximize
Notation for Viterbi
• v_k(t): probability of the most likely path for x_1..x_t ending in state k
• ptr_t(k): pointer to the state that gave the maximizing transition
• π*: the most probable path for the sequence x_1,..,x_T
Steps of the Viterbi algorithm
• Initialization: v_0(0) = 1, v_k(0) = 0 for k > 0
• Recursion: for t = 1 to T
  v_l(t) = e_l(x_t) max_k [v_k(t-1) a_kl]
  ptr_t(l) = argmax_k [v_k(t-1) a_kl]
• Termination: the probability associated with the most likely path is
  P(x, π*) = max_k [v_k(T) a_kN]
Traceback in Viterbi
• Start from the state chosen at termination: π*_T = argmax_k [v_k(T) a_kN]
• Traceback: for t = T down to 1, set π*_t-1 = ptr_t(π*_t)
(a small code sketch of the recursion and traceback follows below)
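For concreteness, here is a minimal Python sketch of Viterbi with traceback (my own illustration, not code from the lecture), using the same dictionary representation of transition probabilities `a` and emission probabilities `e` as the forward sketch above. On the example HMM, `viterbi("TAG", a, e, states=[1, 2, 3, 4], end=5)` returns a probability of about 0.004608 and the path [2, 2, 4], matching the computations on the next slides.

```python
def viterbi(x, a, e, states, begin=0, end=None):
    """Return (probability of the most likely path, the path itself)."""
    T = len(x)
    v = [{k: 0.0 for k in states} for _ in range(T + 1)]
    ptr = [{} for _ in range(T + 1)]

    # Initialization: reach each emitting state directly from the begin state
    for k in states:
        v[1][k] = a[begin].get(k, 0.0) * e[k].get(x[0], 0.0)
        ptr[1][k] = begin

    # Recursion: v_l(t) = e_l(x_t) * max_k v_k(t-1) * a_kl, remembering the argmax
    for t in range(2, T + 1):
        for l in states:
            best_k = max(states, key=lambda k: v[t - 1][k] * a[k].get(l, 0.0))
            v[t][l] = e[l].get(x[t - 1], 0.0) * v[t - 1][best_k] * a[best_k].get(l, 0.0)
            ptr[t][l] = best_k

    # Termination: pick the state that maximizes v_k(T) * a_kN
    last = max(states, key=lambda k: v[T][k] * a[k].get(end, 0.0))
    best_prob = v[T][last] * a[last].get(end, 0.0)

    # Traceback: follow the pointers from t = T back to t = 1
    path = [last]
    for t in range(T, 1, -1):
        path.append(ptr[t][path[-1]])
    return best_prob, list(reversed(path))
```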
Viterbi algorithm example
[Figure: the same five-state example HMM as before (begin state 0, emitting states 1-4, end state 5).]

What is the most likely path for TAG?
In class exercise
Viterbi computations for TAG

Entries are v_k(t):

State   t=0   t=1 (T)   t=2 (A)   t=3 (G)
0       1     0         0         0
1       0     0.15      0.012     0.00048
2       0     0.2       0.064     0.00512
3       0     0         0.024     0.00288
4       0     0         0.004     0.00512
5       0     0         0         0.004608

Pointers ptr_t(k):

State   t=1 (T)   t=2 (A)   t=3 (G)
1       0         1         1
2       0         2         2
3       0         1         3
4       0         2         2

Trace back path: the termination step picks state 4 (v_4(3) × a_45 = 0.004608), and following the pointers gives the most likely path 2, 2, 4 for TAG.
Using HMM to detect CpG islands
• Recall the 8-state HMM for CpG islands
• Apply the Viterbi algorithm to a DNA sequence using this HMM
• Contiguous assignments of '+' states will correspond to CpG islands (a small sketch of this last step is shown below)
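As a small illustration (mine, not from the slides) of that last step, here is a sketch that takes a decoded state path labeled with '+' (island) and '-' (background) states, as Viterbi might return for the 8-state model, and reports the contiguous '+' stretches as candidate CpG islands; the state labels and the function name `island_intervals` are illustrative choices.

```python
def island_intervals(path):
    """Return (start, end) index pairs (0-based, end exclusive) for runs of '+' states."""
    intervals, start = [], None
    for i, state in enumerate(path):
        in_island = state.endswith("+")
        if in_island and start is None:
            start = i                        # a new island begins
        elif not in_island and start is not None:
            intervals.append((start, i))     # the current island ends
            start = None
    if start is not None:
        intervals.append((start, len(path)))
    return intervals

# Example: a hypothetical decoded path over the 8-state CpG model
print(island_intervals(["A-", "C+", "G+", "C+", "G+", "T-", "A-"]))   # [(1, 5)]
```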
Summary
• Hidden Markov models are extensions of Markov chains that enable us to model and segment sequence data
• An HMM is defined by a set of states, an emission alphabet, transition probabilities, and emission probabilities
• We have examined two questions for HMMs
  – Computing the probability of a sequence of observed characters given an HMM (Forward algorithm)
  – Computing the most likely sequence of states (path) for a sequence of observed characters (Viterbi algorithm)