A PRESENTATION ON Voice Morphing G.S.MOZE COLLEGE OF ENGINEERING BALEWADI, PUNE -45.


G.S.MOZE COLLEGE OF ENGINEERING
BALEWADI, PUNE -45.
A PRESENTATION ON
Voice Morphing
By: Anil Mahadik
PROJECT GUIDE: Prof. Sonali Ghote
Content
 Title
 Introduction
 History
 Need for vocal tract area function
 Vocal tract area function
 AR-HMM Analysis
 AR-HMM Diagram
 Re-synthesis of Converted voice
 Training Phase
 Conversion and morphing phase
 Application
 Conclusion
 References
Title
 The project title is “Voice Morphing”.
 It presents flexible voice morphing based on a
linear combination of multiple speakers’ vocal
tract area functions.
 Voice morphing, or voice conversion, usually means
transforming a source speaker’s speech into a
target speaker’s.
Introduction
 The main goal of audio morphing methods is a
smooth transformation from one sound to another.
 These techniques can be regarded as a kind of
point-to-point mapping in a feature space.
 Many applications may benefit from this sort of
technology.
 Research on voice morphing aims to relax the
point-to-point restriction, extending it to
area-to-area mapping by introducing multiple
speakers.
History
 Voice morphing is a technology developed at the
Los Alamos National Laboratory in New Mexico,
USA by George Papcun and publicly
demonstrated in 1999.
 Voice morphing enables speech patterns to be
cloned and an accurate copy of a person's voice
to be made, which can then say anything the
operator wishes it to say.
Need for vocal tract area function
 Since the 1990s, many techniques for voice
conversion have been proposed [1-7].
 One successful technique is to use a statistical
method for mapping a source speaker’s voice to a
target speaker’s, but a weakness of such methods is
the discontinuity of formants.
 The proposed method employs an estimated vocal
tract area function to avoid this weakness.
Vocal tract area function (A)
 Interpolation in the vocal tract area domain is
considered to provide reasonably continuous
transition of formants.
 Estimation of the vocal tract area function implies
simultaneous estimation of the voice source
characteristics.
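
The interpolation idea above can be sketched in a few lines. This is only an illustration under stated assumptions: each speaker's area function is taken to be a list of positive section areas of equal length, the blend is done in the log-area domain, and the function name `morph_log_area` and the weight normalization are hypothetical choices, not details given in the slides.

```python
import math


def morph_log_area(area_functions, weights):
    """Blend several speakers' vocal tract area functions.

    Assumption: interpolation is a weighted sum of log areas, so the
    result is a weighted geometric mean per tube section.
    """
    if len(area_functions) != len(weights):
        raise ValueError("need exactly one weight per speaker")
    sections = len(area_functions[0])
    if any(len(area) != sections for area in area_functions):
        raise ValueError("all area functions must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Normalize so the weights form a linear combination summing to 1.
    norm = [w / total for w in weights]
    return [
        math.exp(sum(w * math.log(area[i])
                     for w, area in zip(norm, area_functions)))
        for i in range(sections)
    ]


# Equal blend of two hypothetical two-section area functions.
blended = morph_log_area([[1.0, 4.0], [4.0, 1.0]], [0.5, 0.5])
```

With equal weights, each section of `blended` is the geometric mean of the two speakers' areas, so the result moves continuously as the weights change.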
AR-HMM analysis
 To estimate the vocal tract area function, the
method introduces Auto-Regressive Hidden Markov
Model (AR-HMM) analysis of speech.
 The AR-HMM model represents the vocal tract
characteristics by an AR model and the glottal source
wave by an HMM.
 The AR-HMM analysis estimates the vocal tract
resonance characteristics and vocal source waves in
the sense of maximum likelihood estimation.
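
To make the AR part concrete, here is a minimal sketch of ordinary linear prediction using the Levinson-Durbin recursion. It is not the AR-HMM analysis described above, which also models the glottal source with an HMM; it only shows how AR coefficients and reflection coefficients can be obtained from autocorrelation values. The function names are hypothetical, and the reflection-to-area ratio in `reflection_to_area` follows one common lossless-tube convention; the sign and direction of that convention are assumptions and differ between texts.

```python
def autocorrelation(signal, max_lag):
    """Raw autocorrelation r[0..max_lag] of a list of samples."""
    return [
        sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve for the AR error filter A(z) = 1 + a1*z^-1 + ... + ap*z^-p.

    Returns (coefficients, reflection coefficients, residual energy).
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    reflection = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        reflection.append(k)
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        error *= 1.0 - k * k
    return a, reflection, error


def reflection_to_area(reflection, first_area=1.0):
    """Map reflection coefficients to tube section areas.

    Assumption: A[m+1] = A[m] * (1 - k[m]) / (1 + k[m]); other texts
    use the opposite sign, so treat this as illustrative only.
    """
    areas = [first_area]
    for k in reflection:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas


# Autocorrelation of an ideal first-order AR process, r[k] = 0.5**k.
coeffs, ks, residual = levinson_durbin([1.0, 0.5, 0.25], 2)
areas = reflection_to_area(ks)
```

For this input the recursion recovers a first-order model: the second coefficient comes out as zero, and the residual energy is 0.75.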
Diagram of AR-HMM
Re-synthesis of the converted voice
 There are two phases: a training phase and a
conversion and morphing phase.
 The procedure of each phase is shown in the
following diagrams.
Training phase
 AR-HMM analysis: Speech samples with the same
phonetic content from both the source and target
speakers are analyzed.
 Feature alignment: The feature vectors obtained
above are time-aligned using dynamic time warping
(DTW) in order to compensate for any differences in
duration between source and target utterances.
 Estimation of the conversion function: The aligned
vectors are used to train a joint GMM whose
parameters are then used to construct a stochastic
conversion function.
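
The feature alignment step can be illustrated with a small dynamic time warping routine. This is a sketch assuming each utterance is a list of equal-dimension feature vectors and that Euclidean distance is the frame cost; the name `dtw_path` and the three-way step pattern are illustrative choices, not details taken from the slides.

```python
import math


def dtw_path(source, target):
    """Align two feature sequences with dynamic time warping.

    Returns (total cost, list of (source_index, target_index) pairs).
    """
    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j]: best cumulative cost aligning source[:i] with target[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame = math.dist(source[i - 1], target[j - 1])
            cost[i][j] = frame + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# The target holds its middle frame one step longer than the source.
total, pairs = dtw_path([[0.0], [1.0], [2.0]],
                        [[0.0], [1.0], [1.0], [2.0]])
```

The index pairs in `pairs` select the time-aligned source and target vectors that would then be stacked into joint vectors for GMM training.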
Conversion and morphing phase
 AR-HMM analysis: In this phase only the source
speaker’s utterances are used.
 Feature transformation: The GMM-based
transformation function constructed during training
is used to convert every source log vocal tract area
function and vocal cord cepstrum into its most likely
target equivalent.
 Linear interpolation, synthesis of the source wave,
and LPC synthesis.
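
For reference, the GMM-based transformation is commonly written in the joint-GMM voice conversion literature as the regression below. The slides do not give the formula, so the notation is an assumption: x is a source feature vector, M is the number of mixture components, α_i are the mixture weights, and μ_i and Σ_i are the mean and covariance blocks of the joint source-target GMM.

```latex
\[
F(\mathbf{x}) = \sum_{i=1}^{M} p_i(\mathbf{x})
  \left[ \boldsymbol{\mu}_i^{y}
  + \boldsymbol{\Sigma}_i^{yx}
    \left( \boldsymbol{\Sigma}_i^{xx} \right)^{-1}
    \left( \mathbf{x} - \boldsymbol{\mu}_i^{x} \right) \right],
\qquad
p_i(\mathbf{x}) =
  \frac{\alpha_i \, \mathcal{N}\!\left(\mathbf{x};
        \boldsymbol{\mu}_i^{x}, \boldsymbol{\Sigma}_i^{xx}\right)}
       {\sum_{j=1}^{M} \alpha_j \, \mathcal{N}\!\left(\mathbf{x};
        \boldsymbol{\mu}_j^{x}, \boldsymbol{\Sigma}_j^{xx}\right)}
\]
```

Here p_i(x) is the posterior probability that x belongs to component i, so F(x) is a posterior-weighted sum of per-component linear regressions from source to target features.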
Application
 Applications include the creation of peculiar
voices in animated films.
 Voice morphing has tremendous possibilities in
military psychological warfare and subversion.
 Voice morphing could be used as a battlefield
weapon to issue fake orders to the enemy's troops
that appear to come from their own commanders.
Conclusion
 This paper has presented a voice morphing method
based on mappings in the vocal tract area space and
glottal source wave spectrum that can each be
independently modified.
 These features have been realized using AR-HMM
analysis of speech.
 In future, we will investigate how to improve the
quality of voice conversion with interpolation
techniques.
References
 [1] L. M. Arslan, D. Talkin, “Voice conversion by
codebook mapping of line spectral frequencies and
excitation spectrum,” Proc. Eurospeech, pp. 1347-1350, 1997.
 [2] Y. Stylianou, O. Cappe, “A system for voice
conversion based on probabilistic classification and a
harmonic plus noise model,” Proc. ICASSP, pp. 281-284, 1998.
 [3] A. Kain, “Spectral voice conversion for
text-to-speech synthesis,” Proc. ICASSP, pp. 285-288, 1998.
 [4] H. Ye, S. Young, “High Quality Voice Morphing,”
Proc. IEEE ICASSP, pp. 9-12, 2004.