Instrument Recognition in Polyphonic Music

Jana Eggink
Supervisor: Guy J. Brown
University of Sheffield
Previous Work
• Missing feature approach
• Frequency regions in which partials of a non-target tone were
found were excluded from the recognition process
• Requires the knowledge of all F0s
• Worked well only for low numbers of simultaneous F0s
[Figure: four time-frequency panels: a) target tone, b) interfering tone, c) mixture, d) mixture with mask]
HOARSE 20.02.04
Jana Eggink: Instrument Recognition in Polyphonic Music
New Approach
• Spectral energy of instrument sounds is concentrated in their
harmonics
• Harmonics are therefore less likely to be masked by interfering
sounds
• Build a recogniser based on harmonics only to minimise
mismatch between training and test data
Instrument Recognition in Accompanied Sonatas and Concertos
• A solo instrument is commonly played louder than the
accompaniment
• This is expected to make the corresponding harmonic series stand
out in a spectral representation
• Requires only the extraction of the most prominent F0, which will
most often belong to the solo instrument
[Figure: block diagram with the stages audio signal, spectral peaks, F0 and harmonics, features; instrument classes flute, clarinet, oboe, violin, cello]
Find Spectral Peaks
• Convolve spectrum with a differentiated Gaussian
• Spectrum is smoothed and peaks are transformed into zero
crossings which are easy to detect
• The frequency of a peak is defined by the frequency of the
corresponding FFT bin (for more accuracy a highly zero-padded
FFT is used)
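The peak-picking step above can be sketched as follows. This is a minimal illustration, not the original implementation: the kernel width `sigma`, the edge handling and the function names are assumptions, and the (zero-padded) power spectrum is assumed to have been computed elsewhere.

```python
import math

def differentiated_gaussian(sigma, radius):
    """Samples of the first derivative of a Gaussian on [-radius, radius]."""
    return [-k / sigma ** 2 * math.exp(-k * k / (2.0 * sigma ** 2))
            for k in range(-radius, radius + 1)]

def find_peaks(power, sigma=2.0):
    """Return FFT bin indices of local maxima of the smoothed spectrum.

    Convolving with a differentiated Gaussian gives the derivative of the
    Gaussian-smoothed spectrum, so each peak becomes a zero crossing from
    positive to negative. sigma (in bins) is a hypothetical choice.
    """
    radius = int(math.ceil(3 * sigma))
    kernel = differentiated_gaussian(sigma, radius)
    n = len(power)
    deriv = []
    for i in range(n):
        acc = 0.0
        for j, k in enumerate(range(-radius, radius + 1)):
            idx = min(max(i - k, 0), n - 1)  # repeat edge values at the borders
            acc += kernel[j] * power[idx]
        deriv.append(acc)
    peaks = []
    for i in range(n - 1):
        if deriv[i] > 0.0 and deriv[i + 1] <= 0.0:
            # report whichever of the two bins lies closer to the crossing
            peaks.append(i if abs(deriv[i]) < abs(deriv[i + 1]) else i + 1)
    return peaks
```

The frequency of each peak is then the centre frequency of the returned bin, so the resolution depends on how heavily the FFT was zero-padded.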
[Figure: spectrum, power against frequency]
Find Most Prominent F0
• Pattern matching using ‘harmonic sieves’
• One sieve for every possible F0, with ‘slots’ for every partial
• The more spectral peaks pass through the slots of a sieve, the more
likely the F0
• Problem: octave confusions
• Solution: for every sieve compute a ‘match’ as the sum of the
weighted power of all allocated partials, with higher weights for
lower partials
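A harmonic sieve with a weighted match can be sketched as below. The slides do not give the weights, the slot width or the candidate grid, so the `1 / k` weighting, the 3% relative tolerance and the function names are assumptions made for illustration only.

```python
def sieve_match(f0, peaks, n_partials=15, tolerance=0.03):
    """Weighted power of the spectral peaks that pass through the sieve for f0.

    peaks: list of (frequency, power) pairs with non-negative linear power.
    Slot k is centred on k * f0 and accepts peaks within a relative
    tolerance. Lower partials get higher weights (1 / k, an assumption).
    """
    match = 0.0
    for k in range(1, n_partials + 1):
        centre = k * f0
        in_slot = [p for f, p in peaks if abs(f - centre) <= tolerance * centre]
        if in_slot:
            match += max(in_slot) / k  # strongest peak in the slot
    return match

def most_prominent_f0(peaks, candidates):
    """Pick the candidate F0 whose sieve gives the largest match."""
    return max(candidates, key=lambda f0: sieve_match(f0, peaks))
```

With plain peak counting, a tone at 220 Hz would fit the sieve for 110 Hz just as well, since all its partials also pass through that sieve. Weighting lower partials more strongly penalises the lower octave, because there the tone only fills the even-numbered, lightly weighted slots.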
[Figure: spectrum, power against frequency]
F0 Restriction
• Especially for woodwinds, many of the estimated F0s were below
the range of the instrument
• These F0s were either erroneous or belonged to the accompaniment;
the latter is inevitable in sections where the solo instrument is silent
• Only the highest 50% of all estimated F0s were used for instrument
recognition
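The restriction to the highest 50% of F0 estimates could look like the sketch below. The function name and the per-frame representation of the F0 track are hypothetical; the slides only state the 50% figure.

```python
def keep_highest_f0s(f0_track, fraction=0.5):
    """Return the indices of the frames whose F0 is among the highest `fraction`.

    f0_track: one F0 estimate (in Hz) per frame.
    """
    n_keep = int(len(f0_track) * fraction)
    # frame indices ordered from highest to lowest F0
    order = sorted(range(len(f0_track)), key=lambda i: f0_track[i], reverse=True)
    return sorted(order[:n_keep])  # back in time order
```

Only the features of the returned frames would then be passed on to the recogniser.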
Compute Features
• Frequency and power of first 15 partials
• Frame to frame differences (deltas and delta-deltas) within tones
of continuous F0
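Frame-to-frame differences within tones can be sketched as follows. The criterion for a "continuous F0" is not given on the slides, so the 3% jump threshold and both function names are assumptions.

```python
def split_into_tones(f0_track, max_jump=0.03):
    """Group frame indices into tones; a new tone starts when the F0 jumps.

    max_jump is a hypothetical relative threshold between adjacent frames.
    """
    if not f0_track:
        return []
    tones, current = [], [0]
    for t in range(1, len(f0_track)):
        if abs(f0_track[t] - f0_track[t - 1]) > max_jump * f0_track[t - 1]:
            tones.append(current)
            current = []
        current.append(t)
    tones.append(current)
    return tones

def deltas_within_tone(values):
    """Deltas and delta-deltas of one feature over the frames of one tone."""
    d = [b - a for a, b in zip(values, values[1:])]
    dd = [b - a for a, b in zip(d, d[1:])]
    return d, dd
```

Computing the differences per tone avoids spurious large deltas at note boundaries, where the frequency and power of each partial change abruptly.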
[Figure: example feature values for the first three partials]
                 partial 1   partial 2   partial 3   ...
frequency           220         442         658      ...
  delta              +2          -1          +5      ...
  delta-delta        +1           0          -1      ...
power                60          50          44      ...
  delta               0          +5          -3      ...
  delta-delta         0          +3          +1      ...
Instrument Recognition
• Gaussian mixture models (GMMs), trained on solo music and
isolated tone samples
• One model for every F0 of every instrument
• Very homogeneous training data:
• only one centre per model needed
• very fast convergence during training
• Recognition efficient because models can be restricted to those
trained on the current F0
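With only one centre per model, each GMM reduces to a single Gaussian per instrument and F0. The sketch below assumes diagonal covariances, a variance floor, and that every instrument has a model for each F0 that occurs in the test frames. None of these details are stated on the slides, and all names are hypothetical.

```python
import math
from collections import defaultdict

def train_models(examples, var_floor=1e-3):
    """examples: list of (instrument, f0_label, feature_vector).

    Returns {(instrument, f0_label): (mean, variance)}, i.e. one
    diagonal-covariance Gaussian per F0 of every instrument.
    """
    grouped = defaultdict(list)
    for instrument, f0, vec in examples:
        grouped[(instrument, f0)].append(vec)
    models = {}
    for key, vecs in grouped.items():
        dim = len(vecs[0])
        mean = [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
        var = [max(sum((v[d] - mean[d]) ** 2 for v in vecs) / len(vecs), var_floor)
               for d in range(dim)]
        models[key] = (mean, var)
    return models

def log_likelihood(vec, model):
    """Log density of vec under a diagonal Gaussian (mean, variance)."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * s) + (x - m) ** 2 / s)
               for x, m, s in zip(vec, mean, var))

def classify(frames, models):
    """frames: list of (f0_label, feature_vector) for one sound file.

    For each frame only the models trained on that frame's F0 are
    evaluated; log-likelihoods are summed per instrument over all frames.
    """
    scores = defaultdict(float)
    for f0, vec in frames:
        for (instrument, model_f0), model in models.items():
            if model_f0 == f0:
                scores[instrument] += log_likelihood(vec, model)
    return max(scores, key=scores.get)
```

Restricting the evaluation to models of the current F0 means each frame is scored against one model per instrument rather than against every model in the system.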
Results I
• Isolated tone samples (monophonic)
• Average recognition accuracy: 67%
confusion matrix, isolated tones (rows: stimulus, columns: response):

stimulus    flute  clarinet   oboe  violin  cello
flute         76%       9%     3%     12%     1%
clarinet      16%      64%     9%      8%     2%
oboe           6%      16%    57%     13%     7%
violin         5%       1%     5%     71%    18%
cello          3%       2%     7%     21%    68%
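As a cross-check, the 67% average can be reproduced from the isolated-tone confusion matrix. The sketch assumes that rows are stimuli and columns are responses (each row then sums to roughly 100%), and that the average is the unweighted mean of the per-instrument accuracies.

```python
# Confusion matrix for isolated tones, in percent: confusion[stimulus][response]
confusion = {
    "flute":    {"flute": 76, "clarinet": 9,  "oboe": 3,  "violin": 12, "cello": 1},
    "clarinet": {"flute": 16, "clarinet": 64, "oboe": 9,  "violin": 8,  "cello": 2},
    "oboe":     {"flute": 6,  "clarinet": 16, "oboe": 57, "violin": 13, "cello": 7},
    "violin":   {"flute": 5,  "clarinet": 1,  "oboe": 5,  "violin": 71, "cello": 18},
    "cello":    {"flute": 3,  "clarinet": 2,  "oboe": 7,  "violin": 21, "cello": 68},
}

def average_accuracy(matrix):
    """Unweighted mean of the diagonal (correct responses per stimulus)."""
    return sum(matrix[name][name] for name in matrix) / len(matrix)

row_sums = {name: sum(row.values()) for name, row in confusion.items()}
mean_accuracy = average_accuracy(confusion)  # 67.2, i.e. 67% after rounding
```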
Results II
• Realistic monophonic phrases, 2-10 sec.: 84% correct
• Solo instrument with accompaniment (piano or
orchestra), 2-3 min.: 86% correct
confusion matrix, accompanied solo instruments (rows: stimulus, columns: response):

stimulus    flute  clarinet   oboe  violin  cello
flute         75%       0%     0%     25%     0%
clarinet       6%      88%     0%      6%     0%
oboe           0%       0%    82%     18%     0%
violin         0%       0%     0%     88%    12%
cello          0%       0%     0%      6%    92%
Conclusions and Future Work
• Recognition accuracy comparable to that of other systems
designed to deal with monophonic music only
• Phrases were classified better than isolated tones, which is a
common phenomenon: in longer and more varied examples,
isolated random errors are more likely to average out
• No drop in recognition accuracy between monophonic phrases
and those with accompaniment
• Very good results when averaging over whole sound files, but
not accurate enough for note-by-note classifications
• Use knowledge about the solo instrument to extract the
melody line
• Distinction between solo instrument present / silent necessary