- 中国语音学报 (Chinese Journal of Phonetics), Vol. 11
- Published by the Institute of Linguistics, Chinese Academy of Social Sciences
7. REFERENCES
[1] King, S., et al. 2007. Speech production knowledge in automatic speech recognition. J. Acoust. Soc. Am., 121(2): 723-742.
[2] Badin, P., et al. 2010. Can you 'read' tongue movements? Evaluation of the contribution of tongue display to speech understanding. Speech Communication, 52(6): 493-503.
[3] Youssef, A. B., et al. 2011. Towards a multi-speaker visual articulatory feedback system. In InterSpeech 2011, 589-592.
[4] Lewis, J. 1991. Automated lip-sync: Background and techniques. The Journal of Visualization and Computer Animation, 2(4): 118-122.
[5] Bregler, C., Covell, M. and Slaney, M. 1997. Video rewrite: Driving visual speech with audio. In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, 353-360.
[6] Hofer, G. and Richmond, K. 2010. Comparison of HMM and TMDN methods for lip synchronisation. In InterSpeech 2010.
[7] Schroeter, J. and Sondhi, M. M. 1992. Speech coding based on physiological models of speech production. In Advances in Speech Signal Processing, S. Furui and M. M. Sondhi, Eds. New York: Marcel Dekker Inc., 231-268.
[8] Atal, B. S., et al. 1978. Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique. J. Acoust. Soc. Am., 63(5): 1535-1555.
[9] Ouni, S. and Laprie, Y. 2005. Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion. J. Acoust. Soc. Am., 118(1): 444-460.
[10] Schroeter, J. and Sondhi, M. M. 1994. Techniques for estimating vocal-tract shapes from the speech signal. IEEE Trans. Speech Audio Processing, 2: 133-150.
[11] Roweis, S. 1999. Data driven production models for speech processing. California Institute of Technology.
[12] Hiroya, S. and Honda, M. 2004. Estimation of articulatory movements from speech acoustics using an HMM-based speech production model. IEEE Transactions on Speech and Audio Processing, 12(2): 175-185.
[13] Zhang, L. and Renals, S. 2008. Acoustic-articulatory modeling with the trajectory HMM. IEEE Signal Processing Letters, 15: 245-248.
[14] Ling, Z., Richmond, K. and Yamagishi, J. 2010. An analysis of HMM-based prediction of articulatory movements. Speech Communication, 52(10): 834-846.
[15] Katsamanis, A., Papandreou, G. and Maragos, P. 2009. Face active appearance modeling and speech acoustic information to recover articulation. IEEE Trans. Audio, Speech, and Language Processing, 17(3): 411-422.
[16] Dusan, S. 2000. Statistical estimation of articulatory trajectories from the speech signal using dynamical and phonological constraints. Dept. of Electrical and Computer Engineering, University of Waterloo.
[17] Toutios, A. and Margaritis, K. 2008. Contribution to statistical acoustic-to-EMA mapping. In InterSpeech 2008.
[18] Toda, T., Black, A. W. and Tokuda, K. 2008. Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model. Speech Communication, 50: 215-227.
[19] Hogden, J., et al. 1996. Accurate recovery of articulator positions from acoustics: New conclusions based on human data. J. Acoust. Soc. Am., 100: 1819-1834.
[20] Papcun, G., et al. 1992. Inferring articulation and recognising gestures from acoustics with a neural network trained on X-ray microbeam data. J. Acoust. Soc. Am., 92(2): 688-700.
[21] Richmond, K. 2002. Estimating articulatory parameters from the acoustic speech signal. University of Edinburgh.
[22] Uria, B., et al. 2012. Deep architectures for articulatory inversion. In InterSpeech 2012.
[23] Wu, Z., et al. 2015. Acoustic to articulatory mapping with deep neural network. Multimedia Tools and Applications, 74(22): 9889-9907.
[24] Richmond, K. 2007. Trajectory mixture density networks with multiple mixtures for acoustic-articulatory inversion. In Lecture Notes in Computer Science, M. Chetouani, et al., Eds. Berlin: Springer-Verlag, 263-272.
[25] Toutios, A. and Ouni, S. 2011. Predicting tongue positions from acoustics and facial features. In InterSpeech 2011, 2661-2664.
[26] Ioffe, S. and Szegedy, C. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning.
[27] Richmond, K. 2006. A trajectory mixture density network for the acoustic-articulatory inversion mapping. In InterSpeech 2006, 577-580.
FANG Qiang, PhD, is an associate research fellow at the Institute of Linguistics, Chinese Academy of Social Sciences. His research interests include speech production mechanisms, articulatory modeling, and speech engineering.
E-mail: fangqiang@cass.org.cn