A PRESENTATION ON Voice Morphing, G.S. MOZE COLLEGE OF ENGINEERING, BALEWADI, PUNE-45.
G.S. MOZE COLLEGE OF ENGINEERING
BALEWADI, PUNE-45.
A PRESENTATION ON
Voice Morphing
By: Anil Mahadik
Project Guide: Prof. Sonali Ghote
Content
Title
Introduction
History
Need for the vocal tract area function
Vocal tract area function
AR-HMM Analysis
AR-HMM Diagram
Re-synthesis of Converted voice
Training Phase
Conversion and morphing phase
Application
Conclusion
References
Title
The project title is “Voice Morphing”.
It presents flexible voice morphing based on a linear
combination of multiple speakers’ vocal tract area
functions.
Voice morphing, or voice conversion, usually means
transforming a source speaker’s speech into a
target speaker’s.
Introduction
The main goal of audio morphing methods is a
smooth transformation from one sound to another.
Existing techniques can be regarded as a kind of
point-to-point mapping in a feature space.
Many applications may benefit from this sort of
technology.
Current research on voice morphing aims to relax this
point-to-point restriction to an area-to-area mapping
by introducing multiple speakers.
History
Voice morphing is a technology developed at the
Los Alamos National Laboratory in New Mexico,
USA by George Papcun and publicly
demonstrated in 1999.
Voice morphing enables speech patterns to be
cloned and an accurate copy of a person's voice
to be made, which can then say anything the
operator wishes it to say.
Need for the vocal tract area function
Since the 1990s, many techniques for voice
conversion have been proposed [1-7].
One successful approach uses a statistical method
to map a source speaker’s voice to a target
speaker’s. A weakness of such methods is the
discontinuity of formants.
The proposed method uses an estimated vocal
tract area function to avoid this weakness.
Vocal tract area function (A)
Interpolation in the vocal tract area domain is
considered to provide a reasonably continuous
transition of formants.
Estimation of the vocal tract area function implies
simultaneous estimation of the voice source
characteristics.
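A minimal sketch of what interpolation in the vocal tract area domain could look like, assuming each speaker's area function is given as a list of positive section areas of equal length and that blending is done on log areas with weights that sum to 1. The function name `morph_area_functions` and the data layout are hypothetical and are not taken from the presentation.

```python
import math

def morph_area_functions(area_functions, weights):
    """Blend several speakers' vocal tract area functions in the log domain.

    area_functions: one list of positive section areas per speaker,
                    all of the same length (hypothetical layout).
    weights:        one weight per speaker, assumed to sum to 1.
    """
    if len(area_functions) != len(weights):
        raise ValueError("need exactly one weight per speaker")
    n_sections = len(area_functions[0])
    if any(len(a) != n_sections for a in area_functions):
        raise ValueError("all area functions must have the same length")

    morphed = []
    for s in range(n_sections):
        # Weighted sum of log areas, i.e. a weighted geometric mean of areas.
        log_area = sum(w * math.log(a[s])
                       for a, w in zip(area_functions, weights))
        morphed.append(math.exp(log_area))
    return morphed

# Toy example with two speakers and three tube sections.
speaker_a = [1.0, 2.0, 4.0]
speaker_b = [4.0, 2.0, 1.0]
halfway = morph_area_functions([speaker_a, speaker_b], [0.5, 0.5])
```

Sweeping the weights from `[1.0, 0.0]` to `[0.0, 1.0]` moves the blended area function gradually from the first speaker to the second, and more than two speakers can be combined in the same call.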
AR-HMM analysis
To estimate the vocal tract area function, the
method introduces Auto-Regressive Hidden Markov
Model (AR-HMM) analysis of speech.
The AR-HMM model represents the vocal tract
characteristics by an AR model and the glottal source
wave by an HMM.
The AR-HMM analysis estimates the vocal tract
resonance characteristics and vocal source waves in
the sense of maximum likelihood estimation.
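The AR part of the model can be illustrated with ordinary linear prediction. The sketch below is not the AR-HMM estimator: it is plain autocorrelation LPC solved with the Levinson-Durbin recursion, with no HMM model of the glottal source. The conversion from reflection coefficients to tube areas uses one common convention, area ratio (1 - k) / (1 + k); the sign of k and the direction in which sections are numbered differ between texts, so that choice is an assumption. All function names are hypothetical.

```python
def levinson_durbin(r, order):
    """Solve for AR coefficients from autocorrelation values r[0..order].

    Returns (a, reflection, error), where the prediction error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    reflection = []
    for i in range(1, order + 1):
        if error <= 0.0:
            raise ValueError("autocorrelation sequence is not positive definite")
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        reflection.append(k)
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        error *= 1.0 - k * k
    return a, reflection, error

def reflection_to_areas(reflection, first_area=1.0):
    """Map reflection coefficients to relative tube areas.

    Assumed convention: next_area = area * (1 - k) / (1 + k).
    Only ratios are meaningful; first_area is an arbitrary reference.
    """
    areas = [first_area]
    for k in reflection:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas

# Autocorrelation of a first-order AR process with coefficient 0.5.
r = [1.0, 0.5, 0.25]
a, reflection, error = levinson_durbin(r, order=2)
areas = reflection_to_areas(reflection)
```

For this toy autocorrelation the second-order coefficient comes out as zero, as expected for a first-order process. The areas stay positive as long as every reflection coefficient has magnitude below 1.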
Diagram of AR-HMM
Re-synthesis of the converted voice
There are two phases: the training phase and the
conversion and morphing phase.
The procedure for each phase is shown in the
diagrams.
Training phase
AR-HMM analysis: Speech samples with the same
phonetic content from both the source and the target
speaker are analyzed.
Feature alignment: The feature vectors obtained
above are time-aligned using dynamic time warping
(DTW) in order to compensate for any differences in
duration between source and target utterances.
Estimation of the conversion function: The aligned
vectors are used to train a joint GMM whose
parameters are then used to construct a stochastic
conversion function.
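The feature alignment step can be sketched with a textbook dynamic time warping routine. This is a minimal version that assumes Euclidean distance between frames and the basic step pattern (match, insertion, deletion); the presentation does not specify its distance measure or path constraints, and the name `dtw_path` is hypothetical.

```python
def dtw_path(source, target):
    """Align two sequences of feature vectors with dynamic time warping.

    Returns (total_cost, path), where path is a list of
    (source_index, target_index) pairs from start to end.
    """
    def dist(u, v):
        return sum((p - q) ** 2 for p, q in zip(u, v)) ** 0.5

    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning the first i source frames
    # with the first j target frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(source[i - 1], target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])

    # Walk back from the end to recover the frame pairing.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

# The target holds its middle frame for one extra step.
src = [[0.0], [1.0], [2.0]]
tgt = [[0.0], [1.0], [1.0], [2.0]]
total, path = dtw_path(src, tgt)
```

Each pair in `path` gives one aligned source and target frame, which is the form of data a joint model of source and target features would be trained on.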
Training phase (diagram)
Conversion and morphing phase
AR-HMM analysis: In this case only the source
speaker’s utterances are used.
Feature transformation: The GMM-based transformation
function constructed during training is now used to
convert every source log vocal tract area function
and vocal cord cepstrum into its most likely target
equivalent.
Linear interpolation, synthesis of the source wave,
and LPC synthesis.
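The feature transformation step can be illustrated with the standard conversion function of a joint GMM, reduced here to a single scalar feature for readability. The sketch assumes the mixture parameters have already been trained; the dictionary keys and the name `gmm_convert` are hypothetical, and a real system would work on feature vectors with full covariance blocks.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_convert(x, components):
    """Convert a scalar source feature x to its expected target value.

    components: list of dicts with keys
        weight  mixture weight
        mean_x  source mean        mean_y  target mean
        var_x   source variance    cov_yx  target/source covariance
    """
    likelihoods = [c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"])
                   for c in components]
    total = sum(likelihoods)
    converted = 0.0
    for lik, c in zip(likelihoods, components):
        posterior = lik / total  # probability that x came from this component
        # Conditional mean of the target given x within this component.
        conditional_mean = c["mean_y"] + c["cov_yx"] / c["var_x"] * (x - c["mean_x"])
        converted += posterior * conditional_mean
    return converted

# Toy mixture with a single component.
toy_gmm = [{"weight": 1.0, "mean_x": 0.0, "mean_y": 1.0,
            "var_x": 2.0, "cov_yx": 1.0}]
y = gmm_convert(2.0, toy_gmm)
```

With one component the function reduces to a linear regression from source to target. With several components, the posteriors switch softly between local regressions, which is what makes the mapping stochastic rather than a hard codebook lookup.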
Conversion and morphing phase (diagram)
Application
Applications include the creation of unusual voices
for animated films.
Voice morphing has tremendous possibilities in
military psychological warfare and subversion.
Voice morphing is a powerful battlefield weapon
which can be used to provide fake orders to the
enemy's troops, appearing to come from their own
commanders.
Conclusion
This paper has presented a voice morphing method
based on mappings in the vocal tract area space and
the glottal source wave spectrum, each of which can
be modified independently.
These features have been realized using AR-HMM
analysis of speech.
In the future, we will investigate how to improve the
quality of voice conversion with interpolation
techniques.
References
[1] L.M. Arslan, D. Talkin, “Voice conversion by
codebook mapping of line spectral frequencies and
excitation spectrum”, Proc. Eurospeech, pp. 1347-1350, 1997.
[2] Y. Stylianou, O. Cappe, “A system for voice conversion
based on probabilistic classification and a harmonic
plus noise model”, Proc. ICASSP, pp. 281-284, 1998.
[3] A. Kain, “Spectral voice conversion for text-to-speech
synthesis”, Proc. ICASSP, pp. 285-288, 1998.
[4] H. Ye, S. Young, “High Quality Voice Morphing”,
Proc. IEEE ICASSP, pp. 9-12, 2004.