Transcript PhD Exam
PART IV: EXTENSIONS AND DISCUSSION
Cohen, Gannot and Talmon

Linear System Parameterization

• Linear System Parameterization
Any musical instrument is controlled by several independent parameters. For example:
– A flute is controlled by covering holes
– A violin is controlled by the length of the strings
– The speech system is controlled by the positions of the tongue, lips, teeth, etc.

• Linear System Parameterization
Each system can be seen as a "black box" configured by controlling parameters:
– We observe the output of the system
– The sound depends on the system configuration
– Our goal: recover the controlling parameters

• Parameter Recovery
We show that the parameters are conveyed by the eigenvectors obtained via the diffusion framework [Talmon et al., TSP 12'].
Example: [Rabin, 10']

• Temporal Evolution
Example – violin recordings:
– Several samples of the same tone
– Each sample is slightly different due to different finger placements on the string
Temporal evolution of the controlling parameters:
– Perceptual system variation (large scale)
– Small-fluctuations regime (small scale)
The stochastic dynamics are formulated as Itô processes [Talmon et al., TSP 12'].

• Main Components
– Feature selection
– Local metric
– Improved and extendable kernel

• Feature Selection
Use the covariance function of the observed signal:
– Circumvents the dependency on the realization of the input signal
– Robust and generic
The covariance is formulated as a nonlinear function of the parameters [Talmon et al., TSP 12'], which enables presenting the stochastic dynamics using Itô's lemma.
Example – AR process:
– Observe an AR system of order 1 with zero-mean white excitation and a single controlling parameter (the pole)
– The covariance of the observable signal depends nonlinearly on the controlling parameter
– The covariance is therefore viewed as a function of the parameter

• Local Metric
Euclidean distance in the parametric space [Singer & Coifman, 08']:
– Let θ_i and θ_j be two points in the parametric space, and let f_i and f_j be their mappings to the feature space
– The Euclidean distance in the parameter space can be approximated by
  ||θ_i − θ_j||² ≈ (1/2)(f_i − f_j)^T (C_i^{−1} + C_j^{−1})(f_i − f_j)
  where C_i and C_j are the local covariance matrices of the features
– The corresponding Gaussian kernel converges to a diffusion operator
– However, no natural extension to new points is available

• Improved and Extendable Kernel
New kernel [Kushnir et al., 11']:
– Use a modified metric between the feature vectors
– Define the corresponding affinity
– The construction recovers the original kernel and additionally yields an extended kernel

• Improved and Extendable Kernel
Let A = U Σ V^T be the SVD of the extended kernel. Then:
– The columns of U are the eigenvectors of A A^T
– The columns of V are the eigenvectors of A^T A
These eigenvectors represent the independent parameters.

• Experimental Results
Example – AR process:
– Accurate recovery of the poles of AR processes
– Captures the actual degrees of freedom when the poles are dependent
– May be viewed as a natural parameterization of speech signals
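The AR(1) example above can be checked numerically. The sketch below is a minimal stand-in (it is not the diffusion-based method itself): it simulates an AR(1) process with pole a and recovers the pole directly from two covariance elements, illustrating that the covariance of the observed signal is a nonlinear function of the controlling parameter, r(0) = σ²/(1 − a²) and r(1) = a·r(0).

```python
import numpy as np

rng = np.random.default_rng(0)
a = 0.7                        # controlling parameter: the AR(1) pole
n = 50_000
w = rng.standard_normal(n)     # zero-mean, unit-variance white excitation

# simulate x[t] = a * x[t-1] + w[t]
x = np.zeros(n)
for t in range(1, n):
    x[t] = a * x[t - 1] + w[t]

# empirical covariance elements r(0) and r(1)
r0 = np.dot(x, x) / n
r1 = np.dot(x[1:], x[:-1]) / (n - 1)

# theory: r(0) = 1 / (1 - a**2) and r(1) = a * r(0),
# so the pole is recoverable as the ratio r(1) / r(0)
a_hat = r1 / r0
```

The covariance elements depend nonlinearly on the pole, yet the pole is fully recoverable from them; in the diffusion framework this recovery is performed implicitly, through the eigenvectors of a kernel built from such covariance features.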
Supervised Source Localization

• Overview
Consider a reverberant room with a single source and a single microphone:
– Can localization be performed based on this single measurement?
Prior recordings from various known locations are available:
– Localization then becomes a matching problem
– The question is how to incorporate the prior recordings

• Concept
View the acoustic channel as a "black box" controlled by the environmental factors:
– Room dimensions, reflection coefficients, positions of the microphone and the source
Assume the environmental factors are fixed, except for the position of the source:
– The prior recordings then remain relevant
– The source location is the single controlling parameter
"Localization" amounts to recovering this controlling parameter by observing the outputs of the black box.

• Notation
Consider the acoustic impulse response between the microphone and a source at a given relative position, parameterized by the distance between the source and the microphone and by the azimuth and elevation direction of arrival (DOA) angles.

• Training Data
We pick predefined positions of the source. From each position, we play an arbitrary input signal and record the signal picked up by the microphone; the input and output signals of the room are related through the acoustic impulse response corresponding to the source location.

• Training Data
We repeat the training from each source location several times:
– In each repetition, the position of the source is slightly perturbed
– This yields input and output signals corresponding to the repeated measurements at slightly perturbed positions.
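The training-data collection described above can be sketched as follows. Everything here is a toy stand-in: `toy_channel` is a hypothetical position-dependent FIR filter playing the role of the acoustic impulse response, and the position grid, perturbation size, and signal lengths are illustrative, not the values used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_channel(theta, length=64):
    """Hypothetical stand-in for the acoustic impulse response h(theta):
    a decaying FIR filter whose taps vary smoothly with the source position."""
    t = np.arange(length)
    return np.exp(-0.05 * t) * np.cos(theta * t)

n_positions, n_repeats, n_samples = 8, 10, 2000
positions = np.linspace(0.5, 2.5, n_positions)   # predefined source positions

clouds = []
for theta in positions:
    cloud = []
    for _ in range(n_repeats):
        delta = 0.01 * rng.standard_normal()     # small perturbation of the position
        u = rng.standard_normal(n_samples)       # arbitrary input signal
        # output = input convolved with the (perturbed) position-dependent channel
        y = np.convolve(u, toy_channel(theta + delta))[:n_samples]
        cloud.append(y)
    clouds.append(np.array(cloud))
clouds = np.array(clouds)   # shape: (positions, repeats, samples)
```

Each slice `clouds[i]` collects the repeated, slightly perturbed recordings from position `i`; these "clouds" are what the later stages use to estimate local feature covariances.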
• Algorithm
Input: a new measurement of an arbitrary, unknown input signal played from an unknown source position.
Goal: recover the source position given the measured signal, capitalizing on the prior training information.

• Feature Extraction
The measured signal heavily depends on the unknown input signal:
– Consequently, the position of the source is only weakly disclosed by the "raw" time-domain measurements
To overcome this challenge, we compute features that:
– better convey the position
– are less dependent on the particular input signal
Proposed features – covariance elements:
– The covariance function of the output signal is determined by the covariance functions of the input signal and of the acoustic impulse response

• Feature Extraction
For each measurement, we compute a feature vector consisting of elements of the covariance:
– Geometrically, the feature vectors of the repeated measurements from the same position are viewed as a "cloud" of points around that position in the feature space
– Each cloud is used to estimate the local covariance matrix of the features
Data summary: the training input consists of the feature vectors together with their clouds.

• Training Stage
We compute an affinity matrix between the training feature vectors, using a Gaussian kernel with a chosen kernel scale.
Key point: the proposed kernel captures the actual variability in terms of the source position.

• Training Stage
Compute the eigenvalues and eigenvectors of the kernel.
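The training stage up to this point (covariance features, cloud-based local covariances, locally adapted affinity, eigendecomposition) can be sketched on a toy one-parameter system. This is an illustrative sketch, not the authors' implementation: the system is an AR(1) process whose pole plays the role of the controlling parameter, and the shrinkage regularization of the local covariances is an assumption added for numerical stability.

```python
import numpy as np

rng = np.random.default_rng(2)

def cov_features(x, lags=8):
    """Feature vector of empirical covariance elements r(0), ..., r(lags-1)."""
    x = x - x.mean()
    return np.array([np.dot(x[k:], x[:len(x) - k]) / (len(x) - k) for k in range(lags)])

# toy training set: one controlling parameter (an AR(1) pole),
# with a small "cloud" of perturbed realizations per training point
poles = np.linspace(0.2, 0.9, 15)
feats, cov_invs = [], []
for a in poles:
    cloud = []
    for _ in range(10):
        a_j = a + 0.005 * rng.standard_normal()   # slightly perturbed parameter
        w = rng.standard_normal(4000)
        x = np.zeros_like(w)
        for t in range(1, len(w)):
            x[t] = a_j * x[t - 1] + w[t]
        cloud.append(cov_features(x))
    cloud = np.array(cloud)
    feats.append(cloud.mean(axis=0))
    # local covariance of the features, shrunk toward identity for stability
    C = np.cov(cloud.T)
    C += 0.05 * np.trace(C) / C.shape[0] * np.eye(C.shape[0])
    cov_invs.append(np.linalg.inv(C))
F = np.array(feats)

# affinity matrix with the locally adapted (Mahalanobis-like) distance,
# then row normalization and eigendecomposition
n = len(F)
D2 = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        d = F[i] - F[j]
        D2[i, j] = 0.5 * d @ (cov_invs[i] + cov_invs[j]) @ d
eps = np.median(D2[D2 > 0])                 # kernel scale
K = np.exp(-D2 / eps)
K /= K.sum(axis=1, keepdims=True)           # row-stochastic kernel
vals, vecs = np.linalg.eig(K)
order = np.argsort(-vals.real)
psi1 = vecs[:, order[1]].real               # leading nontrivial eigenvector
```

Up to sign and a monotonic distortion, `psi1` orders the training points according to the controlling parameter, which is the property the training stage relies on.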
• Training Stage
There exist eigenvectors that represent the data in terms of its independent parameters. We assume these eigenvectors correspond to the largest eigenvalues:
– These independent parameters represent the desired position coordinates of the source
The bottom line: the spectral representation of the kernel recovers the source position.

• Test Stage
Extend the spectral representation to the new measurement:
– Compute weights with respect to the training points
– Calculate the new spectral coordinate
The embedding of the measurements onto the space spanned by the extended eigenvectors corresponds to the source position.

• Test Stage
We show that the map recovers the position of the source up to a monotonic distortion:
– The monotonic character enables the map to organize the measurements according to the values of the source position coordinates
Estimate the position by interpolating the training positions according to distances in the embedded space, using the k-nearest training measurements in the embedded space.

• Algorithm Summary
Training stage:
– Obtain training measurements, and calculate the corresponding feature vectors
– Given "clouds" of additional measurements, estimate the local covariances
– Compute the affinity kernel and apply EVD
– Detect the relevant eigenvectors
Test stage:
– Obtain a new measurement, and calculate the corresponding feature vector
– Extend the spectral representation to the new measurement
– Construct the embedding
– Recover the new position by interpolating the training positions according to the embedding "organization"

• Recording Setup
An acoustic room at Bar-Ilan University:
– Reverberation is controlled by double-sided panels (either absorbing or reflecting)
– The panels are arranged to yield the desired reverberation time

• Recording Setup
Inside the room:
– An omnidirectional microphone in a fixed location
– A long "arm" connected to the base of the microphone and attached to a turntable that controls the horizontal angle
– A mouth simulator located at the far end of the arm

• Recording Setup
Recordings:
– Zero-mean, unit-variance white Gaussian noise was played from the mouth simulator at evenly spaced DOA angles
– The movement of the arm along the angles was repeated several times
Organization:
– Due to small perturbations of the long arm, the exact location is not maintained during the entire period
– Each measurement was divided into segments
– Segments from the same DOA angle are viewed as a "cloud" of points

• Source Localization
Results:
– The leading eigenvector recovers the DOA
– Accurate estimation of the DOA

• Summary
– Presented a supervised algorithm for source localization using a diffusion kernel
– Based solely on single-channel recordings
– Applied a manifold-learning approach to incorporate prior measurements
– Experimental results in a real reverberant environment showed accurate estimation of the source direction of arrival

• Future Work
– Extend the algorithm to noisy environments
– Investigate the influence of environmental changes after the training stage: when furniture is added or people are moving around the speaker, how relevant are the prior recordings?
– Incorporate the prior information in traditional (multichannel) methods: can we achieve improved results?

Conclusions

• Conclusions
Combining geometric information is beneficial:
– Better performance
– Efficient and simple
– Insight and analysis
The presented methods enable capturing of:
– Transient interferences
– Acoustic parameters
– Artificial signals (AR processes)

• Future Work
Further improve the methods:
– Support more complicated signals (e.g., speech and music)
Develop a filtering/processing framework:
– Supervised
– Nonlinear
– Incorporating temporal information
– Handling measurement noise
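As a concrete illustration of the training/test pipeline summarized in the algorithm above, the sketch below builds a toy embedded training set, extends the leading spectral coordinate to a new measurement (a Nyström-style out-of-sample extension), and recovers the position by interpolating the nearest training positions in the embedded space. The two-dimensional features standing in for the covariance features, the plain Gaussian affinity (rather than the locally adapted kernel), and all numeric values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# toy training set: 40 measurements with known scalar positions; points on a
# noisy arc stand in for the covariance feature vectors
positions = np.linspace(0.0, 1.0, 40)
features = np.column_stack([np.cos(np.pi * positions), np.sin(np.pi * positions)])
features += 0.01 * rng.standard_normal(features.shape)

def gauss_affinity(A, B, eps):
    """Gaussian affinities between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / eps)

# training stage: row-normalized kernel and its leading nontrivial eigenvector
eps = 0.05
K = gauss_affinity(features, features, eps)
K /= K.sum(axis=1, keepdims=True)
vals, vecs = np.linalg.eig(K)
order = np.argsort(-vals.real)
lam = vals.real[order[1]]
psi = vecs[:, order[1]].real                     # spectral coordinate of the training set

# test stage: extend the spectral coordinate to a new measurement
true_pos = 0.37                                  # unknown to the algorithm
f_new = np.array([np.cos(np.pi * true_pos), np.sin(np.pi * true_pos)])
w = gauss_affinity(f_new[None, :], features, eps)[0]
w /= w.sum()                                     # weights over the training points
psi_new = (w @ psi) / lam                        # extended spectral coordinate

# recover the position: interpolate the k nearest training positions in the
# embedded space, weighted by inverse embedded distance
k = 3
idx = np.argsort(np.abs(psi - psi_new))[:k]
wk = 1.0 / (np.abs(psi[idx] - psi_new) + 1e-12)
pos_hat = (wk @ positions[idx]) / wk.sum()
```

Because the spectral coordinate recovers the position only up to a monotonic distortion, the final interpolation over the k nearest embedded training points is what converts the embedding back into physical position units.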
• References – Part IV
– J. B. Allen and D. A. Berkley, Image method for efficiently simulating small-room acoustics, Journal of the Acoustical Society of America, vol. 65, no. 4, pp. 943–950, 1979.
– D. Kushnir, A. Haddad, and R. Coifman, Anisotropic diffusion on sub-manifolds with application to earth structure classification, Appl. Comput. Harmon. Anal., vol. 32, no. 2, pp. 280–294, 2012.
– A. Singer, Spectral independent component analysis, Appl. Comput. Harmon. Anal., vol. 21, pp. 128–134, 2006.
– A. Singer and R. Coifman, Non-linear independent component analysis with diffusion maps, Appl. Comput. Harmon. Anal., vol. 25, pp. 226–239, 2008.
– A. Singer, R. Erban, I. G. Kevrekidis, and R. Coifman, Detecting intrinsic slow variables in stochastic dynamical systems by anisotropic diffusion maps, Proc. Nat. Acad. Sci., vol. 106, no. 38, pp. 16090–16095, 2009.
– R. Talmon, I. Cohen, and S. Gannot, Supervised source localization using diffusion kernels, in Proc. WASPAA-2011.
– R. Talmon, D. Kushnir, R. R. Coifman, I. Cohen, and S. Gannot, Parametrization of linear systems using diffusion kernels, IEEE Trans. Signal Process., vol. 60, no. 3, pp. 1159–1173, Mar. 2012.
– [Online] E. A. P. Habets, Room impulse response (RIR) generator. Available: http://home.tiscali.nl/ehabets/rir generator.html
– [Online] Code available: http://users.math.yale.edu/rt294/