
Hidden Markov models

Sushmita Roy
BMI/CS 576
www.biostat.wisc.edu/bmi576/
[email protected]

Oct 16th, 2014

Key concepts

• What are Hidden Markov models (HMMs)?
  – States, emission characters, parameters
• Three important questions in HMMs and algorithms to solve them
  – Probability of a sequence of observations: Forward algorithm
  – Most likely path (sequence of states): Viterbi algorithm
  – Parameter estimation: Baum-Welch/Forward-backward algorithm

Revisiting the CpG question

• Given a sequence x_1..x_T, we can use two Markov chains to decide if x_1..x_T is a CpG island or not.

• What do we do if we were asked to "find" the CpG islands in the genome?

• We have to search for these "islands" in a "sea" of non-islands.

A simple HMM for identifying CpG islands

[Figure: an 8-state HMM with CpG-island states A+, C+, G+, T+ and background states A, C, G, T.]

Note: there is no longer a one-to-one correspondence between states and observed symbols.

An HMM for an occasionally dishonest casino

[Figure: two-state HMM with a Fair state and a Loaded state.]

  Transitions: Fair→Fair 0.95, Fair→Loaded 0.05, Loaded→Loaded 0.9, Loaded→Fair 0.1
  Fair die emissions:   1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6
  Loaded die emissions: 1: 1/10, 2: 1/10, 3: 1/10, 4: 1/10, 5: 1/10, 6: 1/2

What is hidden? Which die is rolled.
What is observed? The number (1-6) on the die.
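As a concrete reference, the casino model above can be written down directly as transition and emission tables. This is a minimal Python sketch; the dictionary layout and the state labels "F" and "L" are our own choices, not something defined in the lecture.

```python
# The occasionally dishonest casino HMM as plain dictionaries.
# "F" = fair die state, "L" = loaded die state (labels are ours).

transitions = {
    "F": {"F": 0.95, "L": 0.05},   # fair die is kept with prob 0.95
    "L": {"F": 0.10, "L": 0.90},   # loaded die is kept with prob 0.9
}

emissions = {
    "F": {face: 1 / 6 for face in range(1, 7)},                  # fair die
    "L": {**{face: 1 / 10 for face in range(1, 6)}, 6: 1 / 2},   # loaded die
}
```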

What does an HMM do?

• Enables us to model observed sequences of characters generated by a hidden dynamic system
• The system can exist in a fixed number of "hidden" states
• The system probabilistically transitions between states, and at each state it emits a symbol/character

Formally defining an HMM

• States
• Emission alphabet
• Parameters
  – State transition probabilities for probabilistic transitions from the state at time t to the state at time t+1
  – Emission probabilities for probabilistically emitting symbols from a state

Notation

• States with emissions will be numbered from 1 to K; 0 denotes the begin state and N the end state
• x_t: observed character at position t
• x = x_1..x_T: observed sequence
• π = π_1..π_T: hidden state sequence, or path
• Transition probabilities: a_kl = probability of a transition from state k to state l
• Emission probabilities: e_k(b) = probability of emitting symbol b from state k

An example HMM

[Figure: a five-state HMM with begin state 0, emitting states 1-4, and end state 5. a_13 = 0.8 is the probability of a transition from state 1 to state 3; e_2(A) = 0.4 is the probability of emitting character A in state 2.]

  Transition probabilities:
    a_01 = 0.5, a_02 = 0.5
    a_11 = 0.2, a_13 = 0.8
    a_22 = 0.8, a_24 = 0.2
    a_33 = 0.4, a_35 = 0.6
    a_44 = 0.1, a_45 = 0.9

  Emission probabilities:
    state 1: A 0.4, C 0.1, G 0.2, T 0.3
    state 2: A 0.4, C 0.1, G 0.1, T 0.4
    state 3: A 0.2, C 0.3, G 0.3, T 0.2
    state 4: A 0.1, C 0.4, G 0.4, T 0.1
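If it helps to compute with this model, the same numbers can be written as Python dictionaries. This is a minimal sketch; the dictionary layout and the variable names a and e are our own, chosen to mirror the a_kl / e_k(b) notation above.

```python
# Example HMM from the slides: begin state 0, emitting states 1-4, end state 5.
# a[k][l] = probability of a transition from state k to state l
# e[k][b] = probability of emitting character b from state k

a = {
    0: {1: 0.5, 2: 0.5},
    1: {1: 0.2, 3: 0.8},
    2: {2: 0.8, 4: 0.2},
    3: {3: 0.4, 5: 0.6},
    4: {4: 0.1, 5: 0.9},
}

e = {
    1: {"A": 0.4, "C": 0.1, "G": 0.2, "T": 0.3},
    2: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    3: {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    4: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}
```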

Path notation

[Figure: the example HMM again, with the observed sequence AAC generated along the path 0 → 1 → 1 → 3 → 5.]

  P(AAC, π = 0,1,1,3,5) = a_01 × e_1(A) × a_11 × e_1(A) × a_13 × e_3(C) × a_35
                        = 0.5 × 0.4 × 0.2 × 0.4 × 0.8 × 0.3 × 0.6
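For a fixed path, this joint probability is just a product of one transition and one emission term per observed character, plus the final transition into the end state. Below is a minimal Python sketch that reproduces the product above, assuming the a and e dictionaries from the earlier sketch; the function name path_probability is our own.

```python
def path_probability(x, path, a, e):
    """P(x, path) for a path given as [begin, s1, ..., sT, end]."""
    prob = 1.0
    for t, symbol in enumerate(x, start=1):
        prob *= a[path[t - 1]][path[t]]   # transition into state path[t]
        prob *= e[path[t]][symbol]        # emission of x_t from state path[t]
    prob *= a[path[-2]][path[-1]]         # final transition into the end state
    return prob

# Reproduces the slide's example for AAC along the path 0,1,1,3,5:
print(path_probability("AAC", [0, 1, 1, 3, 5], a, e))  # 0.002304
```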

Three important questions in HMMs

• How likely is an HMM to have generated a given sequence?
  – Forward algorithm
• What is the most likely "path" for generating a sequence of observations?
  – Viterbi algorithm
• How can we learn an HMM from a set of sequences?
  – Forward-backward or Baum-Welch (an EM algorithm)

How likely is a given sequence from an HMM?

The joint probability of a sequence x and a path π factors into an initial transition, the emission of each symbol x_t, and a state transition between consecutive time points:

  P(x, π) = a_{0,π_1} × ∏_{t=1..T} [ e_{π_t}(x_t) × a_{π_t,π_{t+1}} ]

where a_{π_T,π_{T+1}} denotes the final transition into the end state.

How likely is a given sequence from an HMM?

• But we don't know what the path is
• So we need to sum over all paths
• The probability over all paths is: P(x) = Σ_π P(x, π)

Example

• Consider a candidate CpG island: CGCGC
• Given our HMM for the CpG island model, some possible paths that are consistent with this CpG island are:
  C+ G+ C+ G+ C+
  C- G- C- G- C-
  C- G+ C- G+ C-

Number of paths

• For a sequence of length T, how many possible paths through this HMM are there? 2^T

[Figure: a small HMM with begin state 0, two emitting states (state 1: A 0.4, C 0.1, G 0.2, T 0.3; state 2: A 0.4, C 0.1, G 0.1, T 0.4), and end state 3.]

• The Forward algorithm enables us to compute the probability of a sequence by efficiently summing over all possible paths

How likely is a given sequence: Forward algorithm

• Define f_k(t) as the probability of observing x_1..x_t and ending in state k at time t
• This can be written recursively as follows:

  f_l(t) = e_l(x_t) Σ_k f_k(t-1) a_kl

Steps of the Forward algorithm

• Initialization: f_0(0) = 1, where 0 denotes the "begin" state; f_k(0) = 0 for all other states k
• Recursion: for t = 1 to T,
    f_l(t) = e_l(x_t) Σ_k f_k(t-1) a_kl
• Termination: P(x) = Σ_k f_k(T) a_kN, where N is the end state
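A minimal Python sketch of these three steps follows, assuming the a and e dictionaries from the example HMM sketch and a single end state N passed as `end`; the function name and argument layout are our own.

```python
def forward(x, a, e, states, end):
    """Forward algorithm: returns P(x) summed over all paths."""
    # Initialization: at t = 0 all probability mass is in the begin state 0.
    f = {k: (1.0 if k == 0 else 0.0) for k in [0] + list(states) + [end]}

    # Recursion: f_l(t) = e_l(x_t) * sum_k f_k(t-1) * a_kl
    for symbol in x:
        f_new = {}
        for l in states:
            total = sum(f[k] * a.get(k, {}).get(l, 0.0) for k in f)
            f_new[l] = e[l][symbol] * total
        f_new[0] = 0.0      # begin state is never revisited
        f_new[end] = 0.0    # end state cannot occur mid-sequence
        f = f_new

    # Termination: P(x) = sum_k f_k(T) * a_kN
    return sum(f[k] * a.get(k, {}).get(end, 0.0) for k in states)
```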

Forward algorithm example

[Figure: the example HMM from before (begin state 0, emitting states 1-4 with the same transition and emission probabilities, end state 5).]

What is the probability of the sequence TAGA?

In class exercise

Table for TAGA

  f_k(t):

  State | t=0 | t=1 (T) | t=2 (A) | t=3 (G)  | t=4 (A)
  ------+-----+---------+---------+----------+------------
    0   |  1  |  0      |  0      |  0       |  0
    1   |  0  |  0.15   |  0.012  |  0.00048 |  0.0000384
    2   |  0  |  0.2    |  0.064  |  0.00512 |  0.0016384
    3   |  0  |  0      |  0.024  |  0.00576 |  0.0005376
    4   |  0  |  0      |  0.004  |  0.00528 |  0.0001552
    5   |  0  |  0      |  0      |  0       |  0.00046224

  The last entry, f_5(4), is also P(x). It does not require f_1(4) and f_2(4).
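Run on the example HMM, the forward sketch from earlier should reproduce the final entry of this table (a usage sketch under the same assumed names):

```python
p = forward("TAGA", a, e, states=[1, 2, 3, 4], end=5)
print(p)  # approximately 0.00046224, matching the last entry of the table
```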

Three important questions in HMMs

• How likely is an HMM to have generated a given sequence?
  – Forward algorithm
• What is the most likely "path" for generating a sequence of observations?
  – Viterbi algorithm
• How can we learn an HMM from a set of sequences?
  – Forward-backward or Baum-Welch (an EM algorithm)

Viterbi algorithm

• The Viterbi algorithm gives an efficient way to find the most likely/probable sequence of states
• Consider the dishonest casino example
  – Given a sequence of dice rolls, can you infer when the casino was using the loaded versus the fair die?
  – The Viterbi algorithm gives this answer
• Viterbi is very similar to the Forward algorithm
  – Except instead of summing, we maximize

Notation for Viterbi

• v_k(t): probability of the most likely path for x_1..x_t ending in state k
• ptr_t(k): pointer to the state that gave the maximizing transition into state k at time t
• π*: the most probable path for the sequence x_1,..,x_T

Steps of the Viterbi algorithm

• Initialization: v_0(0) = 1; v_k(0) = 0 for all other states k
• Recursion: for t = 1 to T,
    v_l(t) = e_l(x_t) max_k [ v_k(t-1) a_kl ]
    ptr_t(l) = argmax_k [ v_k(t-1) a_kl ]
• Termination: the probability associated with the most likely path is
    P(x, π*) = max_k [ v_k(T) a_kN ]
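A minimal Python sketch of these steps, including the traceback described on the next slide, again assuming the a and e dictionaries from the example HMM sketch; the function and variable names are our own.

```python
def viterbi(x, a, e, states, end):
    """Viterbi algorithm: most probable state path for x and its joint probability."""
    # Initialization: v_0(0) = 1 for the begin state, 0 for all other states.
    v = {k: (1.0 if k == 0 else 0.0) for k in [0] + list(states)}
    pointers = []  # pointers[t-1][l] = state that maximized the transition into l at time t

    # Recursion: v_l(t) = e_l(x_t) * max_k [ v_k(t-1) * a_kl ]
    for symbol in x:
        v_new, ptr = {}, {}
        for l in states:
            best_k = max(v, key=lambda k: v[k] * a.get(k, {}).get(l, 0.0))
            v_new[l] = e[l][symbol] * v[best_k] * a.get(best_k, {}).get(l, 0.0)
            ptr[l] = best_k
        v_new[0] = 0.0  # the begin state is never revisited
        pointers.append(ptr)
        v = v_new

    # Termination: best transition into the end state
    last = max(states, key=lambda k: v[k] * a.get(k, {}).get(end, 0.0))
    best_prob = v[last] * a.get(last, {}).get(end, 0.0)

    # Traceback: follow the pointers from t = T back to t = 1
    path = [last]
    for ptr in reversed(pointers[1:]):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, best_prob
```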

Traceback in Viterbi

• Traceback: for t = T down to 1, follow the pointers backwards
    π*_T = argmax_k [ v_k(T) a_kN ]
    π*_{t-1} = ptr_t(π*_t)

Viterbi algorithm example

[Figure: the same example HMM (begin state 0, emitting states 1-4, end state 5).]

What is the most likely path for TAG?

In class exercise

Viterbi computations for TAG

  v_k(t):

  State | t=0 | t=1 (T) | t=2 (A) | t=3 (G)
  ------+-----+---------+---------+----------
    0   |  1  |  0      |  0      |  0
    1   |  0  |  0.15   |  0.012  |  0.00048
    2   |  0  |  0.2    |  0.064  |  0.00512
    3   |  0  |  0      |  0.024  |  0.00288
    4   |  0  |  0      |  0.004  |  0.00512
    5   |  0  |  0      |  0      |  0.004608

  ptr_t(k):

  State | t=1 (T) | t=2 (A) | t=3 (G)
  ------+---------+---------+---------
    1   |  0      |  0      |  0
    2   |  0      |  0      |  0
    3   |  0      |  1      |  3
    4   |  0      |  2      |  2

  Trace back the path.
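As a check, running the Viterbi sketch from earlier on the example HMM should reproduce the termination value 0.004608 in this table (a usage sketch under the same assumed names; because the two candidates for v_3(3) tie, the pointer for state 3 at t=3 can legitimately come out as either 1 or 3):

```python
path, prob = viterbi("TAG", a, e, states=[1, 2, 3, 4], end=5)
print(path, prob)  # most likely emitting-state path and its probability (about 0.004608)
```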

Using HMM to detect CpG islands

• Recall the 8-state HMM for our CpG island model
• Apply the Viterbi algorithm to a DNA sequence using this HMM
• Contiguous assignments of '+' states will correspond to CpG islands

Summary

• Hidden Markov models are extensions of Markov chains that enable us to model and segment sequence data
• HMMs are defined by a set of states and emission characters, transition probabilities, and emission probabilities
• We have examined two questions for HMMs
  – Computing the probability of a sequence of observed characters given an HMM (Forward algorithm)
  – Computing the most likely sequence of states (or path) for a sequence of observed characters (Viterbi algorithm)