Transcript PPT

A Probabilistic Model for
Melody Segmentation
By Miguel Ferrand, Peter Nelson,
and Geraint Wiggins
Outline
Overview of this model
N-gram models and Entropy
A case study
Comparison with an experiment on real
listeners
Discussion
Overview
A probabilistic approach to predict
segmentation boundaries in melodies
No knowledge of music theory is used in
this model; it is a purely mathematical method
Use entropy as a measure of
unpredictability of music features
Hypothesis: segmentation boundaries will
appear where the entropy changes
N-gram Models (1)
N-gram grammar ((n-1)th-order Markov
model): the probability of a symbol occurring
depends on the n - 1 symbols that
precede it.
The probability of a sequence s = w1…wl of
length l (wji: wi…wj, n: the order):
P(s) = Π(i = 1…l) P(wi | wi-n+1, …, wi-1)
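A minimal sketch of how such a sequence probability could be estimated from raw counts, assuming maximum-likelihood estimates and no start-of-sequence padding (the product skips the first n - 1 symbols). The function names and the toy symbol sequence are hypothetical, not taken from the slides.

```python
from collections import Counter

def train_ngram_counts(sequence, n):
    """Count every n-gram and the (n-1)-symbol context that starts it."""
    last_start = len(sequence) - n + 1
    ngrams = Counter(tuple(sequence[i:i + n]) for i in range(last_start))
    # Only contexts that are followed by a symbol are counted,
    # so the conditional probabilities of each context sum to 1.
    contexts = Counter(tuple(sequence[i:i + n - 1]) for i in range(last_start))
    return ngrams, contexts

def sequence_probability(sequence, ngrams, contexts, n):
    """Multiply P(w_i | previous n-1 symbols) over positions with a full context."""
    prob = 1.0
    for i in range(n - 1, len(sequence)):
        context = tuple(sequence[i - n + 1:i])
        gram = context + (sequence[i],)
        if contexts[context] == 0:
            return 0.0  # unseen context: unsmoothed estimate is zero
        prob *= ngrams[gram] / contexts[context]
    return prob

# Toy training sequence of symbols (hypothetical data).
training = [0, 1, 0, 1, 0, 2]
bigrams, bigram_contexts = train_ngram_counts(training, 2)
p_seen = sequence_probability([0, 1, 0], bigrams, bigram_contexts, 2)
p_unseen = sequence_probability([0, 0], bigrams, bigram_contexts, 2)
```

The unseen sequence gets probability zero, which is the data-sparseness issue that smoothing is meant to address.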
N-gram Model (2)
Problems:
Data sparseness: some P(wi | …) = 0
Longer sequences will have lower counts if the training corpus is
small
Use a linear interpolation smoothing method.
Taking the tri-gram as an example,
P(wk | wk-2, wk-1) = λ1P(wk) + λ2P(wk | wk-1) + λ3P(wk | wk-2,
wk-1),
where λ1 + λ2 + λ3 = 1 and λ1 < λ2 < λ3
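The interpolation can be sketched as below. The weights (0.1, 0.3, 0.6) are an assumption chosen only to satisfy the constraints (they sum to 1 and increase with order); the slides do not give the values used. The helper names are hypothetical.

```python
from collections import Counter

def interpolated_trigram(seq, lambdas=(0.1, 0.3, 0.6)):
    """Build p(w, w2, w1) = l1*P(w) + l2*P(w | w1) + l3*P(w | w2, w1)."""
    l1, l2, l3 = lambdas  # assumed weights, not from the source
    uni = Counter(seq)
    bi = Counter(zip(seq, seq[1:]))
    tri = Counter(zip(seq, seq[1:], seq[2:]))
    # Context counts include only contexts that have a successor.
    ctx1 = Counter(seq[:-1])
    ctx2 = Counter(zip(seq[:-2], seq[1:-1]))
    total = len(seq)

    def ratio(num, den):
        return num / den if den else 0.0

    def prob(w, w2, w1):
        p_uni = ratio(uni[w], total)
        p_bi = ratio(bi[(w1, w)], ctx1[w1])
        p_tri = ratio(tri[(w2, w1, w)], ctx2[(w2, w1)])
        return l1 * p_uni + l2 * p_bi + l3 * p_tri

    return prob

# Toy symbol sequence (hypothetical data).
p = interpolated_trigram([0, 1, 0, 1, 0, 2])
p_unseen_trigram = p(2, 0, 1)  # trigram (0, 1, 2) never occurs in training
p_seen_trigram = p(0, 0, 1)    # trigram (0, 1, 0) occurs in training
```

Because the uni-gram term is non-zero for any symbol seen in training, the unseen tri-gram still receives a small positive probability.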
Entropy
For an N-gram model M, the entropy Hc(M)
associated with context c (e ranges over all possible
successor symbols of c):
Hc(M) = - Σe P(e | c) log P(e | c)
P(e | c) is calculated with the linear interpolation smoothing
method.
Low entropy usually means high predictability.
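As an illustration, the entropy of one context can be computed from its successor distribution. This sketch assumes base-2 logarithms (entropy in bits), which the slides do not specify, and takes the probabilities as a plain dictionary; the function name is hypothetical.

```python
import math

def context_entropy(successor_probs):
    """Shannon entropy of a successor distribution {symbol: P(symbol | context)}.

    Base 2 is an assumption; zero-probability successors contribute nothing.
    """
    return -sum(p * math.log2(p) for p in successor_probs.values() if p > 0)

# A context with one certain successor is fully predictable.
h_certain = context_entropy({"up": 1.0})
# A context with four equally likely successors is maximally unpredictable.
h_uniform = context_entropy({"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25})
```

The certain context has zero entropy, while the uniform one has the maximum entropy for four successors.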
A case study (1)
Deliège’s experiment
Subjects listened to a melody and had to
identify segmentation points in real time. (The
melody was the solo for English horn from Tristan and
Isolde by Wagner.)
Subjects included both musically trained and
untrained listeners.
Found 8 main segment boundaries
A case study (2)
Translate melody information to an event-based representation
Pitch Step (PS): interval distance to following
event in semitones
Pitch Contour (PC): the sign of PS, {-1, +1, 0}
Duration Ratio (DR): ratio of the durations of the present and
following events
Duration Contour (DC): the direction of change in duration, derived from DR; -1 if
DR > 1; 1 if DR < 1; 0 if DR = 1
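The four features can be sketched as a simple pass over consecutive note pairs. Two details are assumptions, since the slides do not state them: PS is taken as following pitch minus present pitch, and DR as present duration divided by following duration. The input format (MIDI pitch, duration) and the function names are hypothetical.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def extract_features(notes):
    """notes: list of (midi_pitch, duration) pairs.

    Returns one feature dict per event that has a following event.
    """
    features = []
    for (pitch, dur), (next_pitch, next_dur) in zip(notes, notes[1:]):
        ps = next_pitch - pitch   # assumed direction: following minus present
        dr = dur / next_dur       # assumed direction: present over following
        dc = -1 if dr > 1 else (1 if dr < 1 else 0)
        features.append({"PS": ps, "PC": sign(ps), "DR": dr, "DC": dc})
    return features

# Toy melody (hypothetical data): C4, D4, B3, B3 with varying durations.
melody = [(60, 1.0), (62, 1.0), (59, 0.5), (59, 2.0)]
feats = extract_features(melody)
```

Each feature then yields its own symbol sequence, so a separate N-gram model can be trained per feature.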
A case study (3)
A case study (4)
Tri-gram, bi-gram and uni-gram models were
generated for PS, PC, DR and DC.
The standard deviation of entropy is calculated
over a sliding window (size = 10)
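A rough sketch of the windowed standard deviation over a per-event entropy sequence. It assumes the population standard deviation and windows that start at each event; the slides state only the window size, so both choices, the function name and the toy values are hypothetical.

```python
import statistics

def sliding_std(values, window=10):
    """Population standard deviation of each run of `window` consecutive values."""
    return [
        statistics.pstdev(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]

# Toy entropy profile (hypothetical data): flat, then strongly fluctuating.
entropies = [1.0] * 10 + [1.0, 5.0] * 5
profile = sliding_std(entropies, window=10)
```

The profile stays at zero over the flat stretch and rises once the window reaches the fluctuating stretch, which is the kind of change the model looks for near a boundary.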
Results
A case study (5)
A case study (6)
Result
Duration-based features have a much higher
entropy variance than pitch-based features.
Time-based features are therefore likely to
convey more information for segmentation.
Distinct changes in entropy coincided with the
melody segment boundaries indicated by
listeners.
Discussion
The N-gram model might be over-simplified for
music sequences:
a state depends only on the previous n - 1 states.
However, human memory is not infinite
either.
The ability to establish large-span temporal
relations is limited.