PhD Exam Transcript

PART IV: EXTENSIONS AND DISCUSSION
Linear System Parameterization
• Linear System Parameterization
Any musical instrument is controlled by several independent parameters
For example:
– A flute is controlled by covering holes
– A violin is controlled by the length of the strings
– The speech system is controlled by the position of the tongue, lips, teeth, etc.
Cohen, Gannot and Talmon
PART IV: EXTENSIONS AND DISCUSSION
Linear System Parameterization
• Linear System Parameterization
Each system can be seen as a “black box” configured by controlling
parameters
– We observe the output of the system
– The sound depends on the system configuration
– Our goal: recover the controlling parameters
Linear System Parameterization
• Parameter Recovery
We show that the parameters are conveyed by the eigenvectors obtained
via the diffusion framework [Talmon et al., TSP 12’]
Example: [Rabin, 10’]
Linear System Parameterization
• Temporal Evolution
Example – violin recordings:
– Several samples of the same tone
– Each sample is slightly different due to different finger placements on the string
Temporal evolution of the controlling parameters
– Perceptual system variation (large scale)
– Small fluctuations regime (small scale)
Formulating the stochastic dynamics as Itô processes [Talmon et al., TSP 12’]
Linear System Parameterization
• Main Components
– Feature selection
– Local metric
– Improved and extendable kernel
Linear System Parameterization
• Feature Selection
Use the covariance function of the observed signal
– Circumvents the dependency on the realization of the input signal
– Robust and generic
Formulated as a nonlinear function of the parameters [Talmon et al., TSP 12’]
– Enables presenting the stochastic dynamics using the Itô lemma
Example – AR process:
– Observe the AR system of order 1: y(n) = a y(n-1) + w(n), with zero-mean white excitation w(n) and a controlling parameter (pole) a
– The covariance of the observable signal nonlinearly depends on the controlling parameter: c_y(l) = sigma_w^2 a^{|l|} / (1 - a^2)
– Viewed as a function of the parameter: a -> c_y(l; a)
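As a numerical sanity check (an illustration, not the authors' code), the AR(1) covariance feature can be sketched as follows; the closed-form autocovariance c_y(l) = sigma_w^2 a^{|l|}/(1 - a^2) is the standard result for this model:

```python
import numpy as np

def ar1_covariance(a, lags, sigma_w=1.0):
    """Closed-form autocovariance of y(n) = a*y(n-1) + w(n)."""
    lags = np.asarray(lags)
    return sigma_w**2 * a**np.abs(lags) / (1.0 - a**2)

def empirical_covariance(y, lags):
    """Biased sample autocovariance at the given lags."""
    y = y - y.mean()
    n = len(y)
    return np.array([np.dot(y[:n - l], y[l:]) / n for l in lags])

# Simulate a long AR(1) realization with pole a = 0.6.
rng = np.random.default_rng(0)
a = 0.6
w = rng.standard_normal(200_000)
y = np.zeros_like(w)
for n in range(1, len(w)):
    y[n] = a * y[n - 1] + w[n]

lags = [0, 1, 2, 3]
c_emp = empirical_covariance(y, lags)
c_th = ar1_covariance(a, lags)
# The feature vector (c_y(0), c_y(1), ...) depends nonlinearly on the pole a.
print(np.round(c_th, 4))
```

The empirical covariance features match the closed form regardless of the particular white-noise realization, which is exactly why they are a robust proxy for the controlling parameter.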
Linear System Parameterization
• Local Metric
Euclidean distance in the parametric space [Singer & Coifman, 08’]
Let theta_i and theta_j be two points in the parametric space, and let z_i = f(theta_i) and z_j = f(theta_j) be their mapping to the feature space
The Euclidean distance in the parameter space can be approximated by the modified Mahalanobis distance
||theta_i - theta_j||^2 ≈ (1/2) (z_i - z_j)^T (C_i^{-1} + C_j^{-1}) (z_i - z_j)
where C_i and C_j are the local covariance matrices of the features
The corresponding Gaussian kernel exp(-d^2(z_i, z_j)/epsilon) converges to a diffusion operator
– However, no natural extension to new points is available
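A minimal sketch of the modified Mahalanobis metric (the function and variable names are illustrative; the local covariances are estimated from "clouds" of perturbed measurements, as in the slides):

```python
import numpy as np

def mahalanobis_sq(z_i, z_j, C_i, C_j):
    """Modified Mahalanobis distance from [Singer & Coifman, 08']:
    d^2 = 0.5 * (z_i - z_j)^T (C_i^{-1} + C_j^{-1}) (z_i - z_j)."""
    d = z_i - z_j
    return 0.5 * d @ (np.linalg.inv(C_i) + np.linalg.inv(C_j)) @ d

# A local covariance estimated from an anisotropic cloud of feature vectors.
rng = np.random.default_rng(3)
cloud_i = rng.standard_normal((100, 2)) * [1.0, 0.1]
C_i = np.cov(cloud_i.T)

# With identity covariances the metric reduces to the squared Euclidean distance.
z_i, z_j = np.array([1.0, 0.0]), np.array([0.0, 0.0])
d2 = mahalanobis_sq(z_i, z_j, np.eye(2), np.eye(2))
print(d2)  # 1.0
```

Inverting the local covariances whitens the features locally, so the distance approximates the Euclidean distance between the underlying parameters rather than between the (nonlinearly distorted) features.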
Linear System Parameterization
• Improved and Extendable Kernel
New Kernel [Kushnir et al., 11’]
Use the modified Mahalanobis metric between the measurements and a reference set of points
Define the affinity A_ij = exp(-d^2(z_i, z_j^ref) / epsilon)
The original kernel: W = A A^T (defined on the measurement set)
Extended kernel: W_ext = A^T A (defined on the reference set)
Linear System Parameterization
• Improved and Extendable Kernel
Let A = U S V^T be the SVD of A
We have W = A A^T = U S^2 U^T and W_ext = A^T A = V S^2 V^T
In addition:
– the columns of U are the eigenvectors of A A^T
– the columns of V are the eigenvectors of A^T A
The eigenvectors represent the independent parameters
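The SVD relations above can be verified numerically; this sketch uses a random matrix in place of the actual affinity matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))          # stand-in for the affinity matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)

W = A @ A.T       # "original" kernel on the measurement set
We = A.T @ A      # "extended" kernel on the reference set

# Columns of U are eigenvectors of A A^T with eigenvalues s^2;
# columns of V are eigenvectors of A^T A with the same eigenvalues,
# so one SVD yields both spectral representations.
assert np.allclose(W @ U, U * s**2)
assert np.allclose(We @ Vt.T, Vt.T * s**2)
```

Sharing the spectrum between the two kernels is what makes the representation extendable: eigenvectors on one set determine eigenvectors on the other through A.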
Linear System Parameterization
• Experimental Results
Example – AR process
– Accurate recovery of the poles of AR processes
– Enables capturing the actual degrees of freedom when the poles are dependent
– May be viewed as a natural parameterization of speech signals
Supervised Source Localization
• Overview
Consider a reverberant room with a single source and a single microphone
– Can localization be performed based on this measurement alone?
Prior recordings from various known locations are available
– Localization then becomes a matching problem
– How can the prior recordings be incorporated?
Supervised Source Localization
• Concept
View the acoustic channel as a “black box” controlled by the environmental factors
– Room dimensions, reflection coefficients, positions of the microphone and the source
Assume the environmental factors are fixed, except the position of the source
– The prior recordings may then be relevant
– The source location is the single controlling parameter
“Localization” – recover the controlling parameter by observing the outputs of the black box
Supervised Source Localization
• Notation
Let h_p denote the acoustic impulse response between the microphone and a source at relative position p = (theta, phi, r)
– theta and phi are the azimuth and elevation direction of arrival (DOA) angles
– r is the distance between the source and the microphone
Supervised Source Localization
• Training Data
We pick N predefined positions of the source
From each position, we play an arbitrary input signal and record the signal picked up by the microphone: y_i = h_i * x_i
where x_i and y_i are the input and output signals of the room impulse response h_i corresponding to the source location p_i
Supervised Source Localization
• Training Data
We repeat the training from each source location several times
– However, the position of the source is slightly perturbed in each repetition
Let p_{i,j} denote the small perturbations of p_i
Let x_{i,j} and y_{i,j} be the input and output signals corresponding to the repeated measurements
Supervised Source Localization
• Algorithm Input
The input of the algorithm is a new measurement y of an arbitrary unknown input signal from an unknown source position p
Goal
Recover the source position given the measured signal
– Capitalize on the prior training information
Supervised Source Localization
• Feature Extraction
The measured signal heavily depends on the unknown input signal
– Consequently, the information on the position of the source is weakly
disclosed by the “raw” time domain measurements
To overcome this challenge, we want to compute features that
– better convey the position
– are less dependent on the particular input signal
Proposed features – covariance elements
The covariance function of the output signal is given by the convolution c_y = c_h * c_x
where c_x and c_h denote the covariance (autocorrelation) functions of the input signal x and the impulse response h
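The claim that the output covariance carries the channel information can be checked numerically: for a white, unit-variance input, c_y reduces to the deterministic autocorrelation of h, independent of the input realization. The impulse response below is a toy assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
h = np.array([1.0, 0.5, 0.25])        # toy impulse response (assumed)
x = rng.standard_normal(500_000)      # white, unit-variance input
y = np.convolve(x, h)[:len(x)]        # microphone signal y = h * x

def autocov(v, lags):
    """Biased sample autocovariance at the given lags."""
    v = v - v.mean()
    n = len(v)
    return np.array([np.dot(v[:n - l], v[l:]) / n for l in lags])

lags = [0, 1, 2]
c_y = autocov(y, lags)
# For white input, c_y(l) equals the deterministic autocorrelation of h,
# r_h(l) = sum_n h(n) h(n+l) -- a function of the channel only.
r_h = np.array([np.dot(h[:len(h) - l], h[l:]) for l in lags])
print(np.round(r_h, 4))
```

Since r_h depends only on the acoustic channel (hence on the source position), covariance elements are natural features for matching against the prior recordings.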
Supervised Source Localization
• Feature Extraction
For each measurement, we compute a feature vector consisting of elements of the covariance
Let z_i, z_{i,j}, and z denote the covariance feature vectors of y_i, y_{i,j}, and y
– Geometrically, the vectors z_{i,j} are viewed as a “cloud” of points around z_i in the feature space
– The clouds are used to estimate the local covariance matrices C_i
Data summary
Training – the feature vectors z_i and their clouds z_{i,j}
Input – the feature vector z
Supervised Source Localization
• Training Stage
We compute an affinity matrix between the training vectors
W_ij = exp(-d^2(z_i, z_j) / epsilon)
where epsilon is the kernel scale
Recall: d^2(z_i, z_j) = (1/2) (z_i - z_j)^T (C_i^{-1} + C_j^{-1}) (z_i - z_j)
Key point:
The proposed kernel enables capturing the actual variability in terms of the source position
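A toy sketch of the training stage, assuming a generic Gaussian kernel and a single one-dimensional controlling parameter (the diffusion-maps row normalization here is a standard choice, not necessarily the authors' exact construction):

```python
import numpy as np

def gaussian_kernel(Z, eps):
    """Pairwise Gaussian affinities W_ij = exp(-||z_i - z_j||^2 / eps)."""
    D2 = np.sum((Z[:, None, :] - Z[None, :, :])**2, axis=-1)
    return np.exp(-D2 / eps)

# Toy feature vectors driven by a single controlling parameter t (assumed).
t = np.linspace(0.0, 1.0, 50)
Z = np.column_stack([t, t**2])

W = gaussian_kernel(Z, eps=0.1)
P = W / W.sum(axis=1, keepdims=True)   # row-normalized, diffusion-maps style
eigvals, eigvecs = np.linalg.eig(P)
order = np.argsort(-eigvals.real)
phi1 = eigvecs[:, order[1]].real       # first nontrivial eigenvector

# phi1 organizes the points according to the controlling parameter t.
print(abs(np.corrcoef(phi1, t)[0, 1]) > 0.9)  # True
```

The first nontrivial eigenvector varies essentially monotonically with the hidden parameter, which is the sense in which the spectral representation "recovers" the source position.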
Supervised Source Localization
• Training Stage
Let lambda_l and v_l be the eigenvalues and eigenvectors of the kernel W
There exist eigenvectors that represent the data in terms of its independent parameters
We assume these eigenvectors correspond to the largest eigenvalues
– These independent parameters represent the desired position coordinates of the source
The bottom line
The spectral representation of the kernel recovers the source position
Supervised Source Localization
• Test Stage
Extend the spectral representation to the new measurement z
– Compute the affinity weights w_i = exp(-d^2(z, z_i) / epsilon)
– Calculate the new spectral coordinates v_l(z) = (1/lambda_l) sum_i w_i v_l(i)
Let Psi(z) be the embedding of the measurements onto the space spanned by the extended vectors corresponding to the source position
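The extension step can be sketched as a Nystrom-style formula (an assumption; the slides' exact weights are not reproduced here). Extending a training point reproduces its own spectral coordinates, which gives a quick consistency check:

```python
import numpy as np

def extend_coords(z_new, Z_train, eigvals, eigvecs, eps):
    """Nystrom-style out-of-sample extension (a sketch):
    psi_l(z) = (1/lambda_l) * sum_i w_i(z) * v_l(i), Gaussian weights w_i."""
    w = np.exp(-np.sum((Z_train - z_new) ** 2, axis=1) / eps)
    return (w @ eigvecs) / eigvals

# A small symmetric Gaussian kernel on toy training features.
rng = np.random.default_rng(4)
Z_train = rng.standard_normal((20, 3))
D2 = np.sum((Z_train[:, None, :] - Z_train[None, :, :]) ** 2, axis=-1)
eps = np.median(D2) / 2.0
W = np.exp(-D2 / eps)
eigvals, eigvecs = np.linalg.eigh(W)

# Consistency check: since W v = lambda v, we have
# (1/lambda) * sum_i W_0i v_l(i) = v_l(0), i.e. the point maps to itself.
psi0 = extend_coords(Z_train[0], Z_train, eigvals, eigvecs, eps)
print(np.allclose(psi0, eigvecs[0]))  # True
```

This is what allows the test measurement, which was not seen during training, to be placed consistently in the embedded space.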
Supervised Source Localization
• Test Stage
We show that the map Psi recovers the position of the source up to a monotonic distortion
– The monotonic character enables the map to organize the measurements according to the values of the source position coordinates
Estimate the position: interpolate the training positions according to distances in the embedded space
p_hat = sum_{i in N} a_i p_i
where N consists of the k-nearest training measurements in the embedded space, and a_i are interpolation weights determined by the embedded distances
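The interpolation step can be sketched with inverse-distance weights (an illustrative choice; the exact weights a_i are not specified here):

```python
import numpy as np

def interpolate_position(psi_new, Psi_train, pos_train, k=2):
    """Estimate the position by inverse-distance-weighted interpolation of the
    k-nearest training positions in the embedded space."""
    d = np.linalg.norm(Psi_train - psi_new, axis=1)
    idx = np.argsort(d)[:k]                 # k-nearest neighbors
    w = 1.0 / (d[idx] + 1e-12)              # inverse-distance weights
    w /= w.sum()
    return w @ pos_train[idx]

# Toy training embeddings on a line with known DOA angles (assumed values).
Psi_train = np.array([[0.0], [1.0], [2.0]])
doa_train = np.array([0.0, 30.0, 60.0])

# A test point midway between the first two training points.
print(interpolate_position(np.array([0.5]), Psi_train, doa_train))  # 15.0
```

Because the embedding is monotone in the position coordinate, nearness in the embedded space implies nearness in position, which justifies this local interpolation.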
Supervised Source Localization
• Algorithm Summary
Training Stage:
– Obtain the training measurements and calculate the corresponding feature vectors
– Given “clouds” of additional measurements, estimate the local covariances
– Compute the affinity kernel and apply EVD
– Detect the relevant eigenvectors
Test Stage:
– Obtain a new measurement, and calculate the corresponding feature vector
– Extend the spectral representation to the new measurement
– Construct the embedding
– Recover the new position by interpolating the training positions according to the
embedding “organization”
Supervised Source Localization
• Recording Setup
An acoustic room at Bar-Ilan University
– Reverberation is controlled by double-sided panels (either absorbing or reflecting)
– The panels are arranged to yield the desired reverberation time
Supervised Source Localization
• Recording Setup
Inside the room
– An omnidirectional microphone in a fixed location
– A long “arm” was connected to the base of the microphone and attached to a turntable that controls the horizontal angle
– A mouth-simulator was located at the far end of the arm
Supervised Source Localization
• Recording Setup
Recordings
– DOA angles were measured at a fixed angular spacing
– A zero-mean, unit-variance white Gaussian noise signal was played from the mouth-simulator
– The movement of the arm along the angles was repeated several times
Organization
– Due to small perturbations of the long arm, the exact location is not maintained during the entire recording period
– Each measurement was divided into segments
– Segments from the same DOA angle are viewed as a “cloud” of points
Supervised Source Localization
• Source Localization
Results
– The eigenvector recovers the DOA
– Accurate estimation of the DOA
Supervised Source Localization
• Summary
Presented a supervised algorithm for source localization using a diffusion
kernel
Based solely on single-channel recordings
Applied a manifold learning approach to incorporate prior measurements
Experiments conducted in a real reverberant environment showed accurate estimation of the source direction of arrival
Supervised Source Localization
• Future Work
Extend the algorithm for noisy environments
Investigate the influence of environmental changes following the training
stage
– When furniture is added or when people move around the speaker
– How relevant are the prior recordings?
Incorporate the prior information in traditional (multichannel) methods
– Can we achieve improved results?
Conclusions
• Conclusions
Combining geometric information is beneficial
– Better performance
– Efficient and simple
– Insight and analysis
The presented methods enable capturing
– Transient interferences
– Acoustic parameters
– Artificial signals (AR processes)
Conclusions
• Future Work
Further improve the methods
– Support more complicated signals (e.g. speech and music)
Develop filtering/processing framework
– Supervised
– Nonlinear
– Incorporate temporal information
– Handle measurement noise
• References – Part IV
J. B. Allen and D. A. Berkley, Image method for efficiently simulating small room acoustics, Journal of the Acoustical Society of America, vol. 65, no. 4, pp. 943–950, 1979.
D. Kushnir, A. Haddad, and R. Coifman, Anisotropic diffusion on sub-manifolds with application to earth structure classification, Appl. Comput. Harmon. Anal., vol. 32, no. 2, pp. 280–294, 2012.
A. Singer, Spectral independent component analysis, Appl. Comput. Harmon. Anal., vol. 21, pp. 128–134, 2006.
A. Singer and R. Coifman, Non-linear independent component analysis with diffusion maps, Appl. Comput. Harmon. Anal., vol. 25, pp. 226–239, 2008.
A. Singer, R. Erban, I. G. Kevrekidis, and R. Coifman, Detecting intrinsic slow variables in stochastic dynamical systems by anisotropic diffusion maps, Proc. Nat. Acad. Sci., vol. 106, no. 38, pp. 16090–16095, 2009.
R. Talmon, I. Cohen, and S. Gannot, Supervised source localization using diffusion kernels, in Proc. WASPAA-2011.
R. Talmon, D. Kushnir, R. R. Coifman, I. Cohen, and S. Gannot, Parametrization of linear systems using diffusion kernels, IEEE Trans. Signal Process., vol. 60, no. 3, pp. 1159–1173, Mar. 2012.
[Online] E. A. P. Habets, Room impulse response (RIR) generator, Available: http://home.tiscali.nl/ehabets/rir generator.html
[Online] Code available: http://users.math.yale.edu/rt294/