- 中国语音学报 (Chinese Journal of Phonetics), Vol. 11
- Published by the Institute of Linguistics, Chinese Academy of Social Sciences
7. REFERENCES
[1] King, S., et al. 2007. Speech production knowledge in automatic speech recognition. J. Acoust. Soc. Am., 121(2): 723-742.
[2] Badin, P., et al. 2010. Can you 'read' tongue movements? Evaluation of the contribution of tongue display to speech understanding. Speech Communication, 52(6): 493-503.
[3] Youssef, A. B., et al. 2011. Towards a multi-speaker visual articulatory feedback system. In InterSpeech 2011, 589-592.
[4] Lewis, J. 1991. Automated lip-sync: Background and techniques. The Journal of Visualization and Computer Animation, 2(4): 118-122.
[5] Bregler, C., Covell, M. and Slaney, M. 1997. Video rewrite: Driving visual speech with audio. In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, 353-360.
[6] Hofer, G. and Richmond, K. 2010. Comparison of HMM and TMDN methods for lip synchronisation. In InterSpeech 2010.
[7] Schroeter, J. and Sondhi, M. M. 1992. Speech coding based on physiological models of speech production. In Advances in Speech Signal Processing, S. Furui and M. M. Sondhi, Eds. New York: Marcel Dekker Inc., 231-268.
[8] Atal, B. S., et al. 1978. Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique. J. Acoust. Soc. Am., 63(5): 1535-1555.
[9] Ouni, S. and Laprie, Y. 2005. Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion. J. Acoust. Soc. Am., 118(1): 444-460.
[10] Schroeter, J. and Sondhi, M. M. 1994. Techniques for estimating vocal-tract shapes from the speech signal. IEEE Trans. Speech Audio Processing, 2: 133-150.
[11] Roweis, S. 1999. Data driven production models for speech processing. California Institute of Technology.
[12] Hiroya, S. and Honda, M. 2004. Estimation of articulatory movements from speech acoustics using an HMM-based speech production model. IEEE Transactions on Speech and Audio Processing, 12(2): 175-185.
[13] Zhang, L. and Renals, S. 2008. Acoustic-articulatory modeling with the trajectory HMM. IEEE Signal Processing Letters, 15: 245-248.
[14] Ling, Z., Richmond, K. and Yamagishi, J. 2010. An analysis of HMM-based prediction of articulatory movements. Speech Communication, 52(10): 834-846.
[15] Katsamanis, A., Papandreou, G. and Maragos, P. 2009. Face active appearance modeling and speech acoustic information to recover articulation. IEEE Trans. Audio, Speech, and Language Processing, 17(3): 411-422.
[16] Dusan, S. 2000. Statistical estimation of articulatory trajectories from the speech signal using dynamical and phonological constraints. Dept. of Electrical and Computer Engineering, University of Waterloo.
[17] Toutios, A. and Margaritis, K. 2008. Contribution to statistical acoustic-to-EMA mapping. In InterSpeech 2008.
[18] Toda, T., Black, A. W. and Tokuda, K. 2008. Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model. Speech Communication, 50: 215-227.
[19] Hogden, J., et al. 1996. Accurate recovery of articulator positions from acoustics: New conclusions based on human data. J. Acoust. Soc. Am., 100: 1819-1834.
[20] Papcun, G., et al. 1992. Inferring articulation and recognising gestures from acoustics with a neural network trained on X-ray microbeam data. J. Acoust. Soc. Am., 92(2): 688-700.
[21] Richmond, K. 2002. Estimating articulatory parameters from the acoustic speech signal. University of Edinburgh.
[22] Uria, B., et al. 2012. Deep architectures for articulatory inversion. In InterSpeech 2012.
[23] Wu, Z., et al. 2015. Acoustic to articulatory mapping with deep neural network. Multimedia Tools and Applications, 74(22): 9889-9907.
[24] Richmond, K. 2007. Trajectory mixture density networks with multiple mixtures for acoustic-articulatory inversion. In Lecture Notes in Computer Science, M. Chetouani, et al., Eds. Berlin: Springer-Verlag, 263-272.
[25] Toutios, A. and Ouni, S. 2011. Predicting tongue positions from acoustics and facial features. In InterSpeech 2011, 2661-2664.
[26] Ioffe, S. and Szegedy, C. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning.
[27] Richmond, K. 2006. A trajectory mixture density network for the acoustic-articulatory inversion mapping. In InterSpeech 2006, 577-580.
FANG Qiang, PhD, is an associate research fellow at the Institute of Linguistics, Chinese Academy of Social Sciences. His research interests include speech production mechanisms, articulatory modeling, and speech engineering.
E-mail: fangqiang@cass.org.cn