{"title":"关于声道长度归一化等于背侧空间线性变换的评述","authors":"M. Afify, O. Siohan","doi":"10.1109/TASL.2007.896653","DOIUrl":null,"url":null,"abstract":"The bilinear transformation (BT) is used for vocal tract length normalization (VTLN) in speech recogniton systems. We prove two properties of the bilinear mapping that motivated the band-diagonal transform proposed in M. Afify and O. Siohan, (ldquoConstrained maximum likelihood linear regression for speaker adaptation,rdquo in Proc. ICSLP, Beijing, China, Oct. 2000.) This is in contrast to what is stated in M. Pitz and H. Ney, (ldquoVocal tract length normalization equals linear transformation in cepstral space,rdquo IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp 930-944, September 2005) that the transform of Afify and Siohan was motivated by empirical observations.","PeriodicalId":13155,"journal":{"name":"IEEE Trans. Speech Audio Process.","volume":"23 1","pages":"1731-1732"},"PeriodicalIF":0.0000,"publicationDate":"2007-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":"{\"title\":\"Comments on Vocal Tract Length Normalization Equals Linear Transformation in Cepstral Space\",\"authors\":\"M. Afify, O. Siohan\",\"doi\":\"10.1109/TASL.2007.896653\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The bilinear transformation (BT) is used for vocal tract length normalization (VTLN) in speech recogniton systems. We prove two properties of the bilinear mapping that motivated the band-diagonal transform proposed in M. Afify and O. Siohan, (ldquoConstrained maximum likelihood linear regression for speaker adaptation,rdquo in Proc. ICSLP, Beijing, China, Oct. 2000.) This is in contrast to what is stated in M. Pitz and H. Ney, (ldquoVocal tract length normalization equals linear transformation in cepstral space,rdquo IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp 930-944, September 2005) that the transform of Afify and Siohan was motivated by empirical observations.\",\"PeriodicalId\":13155,\"journal\":{\"name\":\"IEEE Trans. Speech Audio Process.\",\"volume\":\"23 1\",\"pages\":\"1731-1732\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Trans. Speech Audio Process.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TASL.2007.896653\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Trans. Speech Audio Process.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TASL.2007.896653","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11
摘要
在语音识别系统中,双线性变换(BT)被用于声道长度归一化。我们证明了M. Afify和O. Siohan提出的双线性映射的两个性质,这两个性质驱动了带对角变换(ldq -约束的最大似然线性回归用于讲话人自适应,rdquo . Proc. ICSLP,北京,中国,2000年10月)。这与M. Pitz和H. Ney在《声道长度归一化等于倒谱空间的线性变换》中所陈述的相反,见《IEEE语音与音频处理汇刊》第13卷第1期。5, 930-944页,2005年9月),Afify和Siohan的转变是由实证观察推动的。
Comments on Vocal Tract Length Normalization Equals Linear Transformation in Cepstral Space
The bilinear transformation (BT) is used for vocal tract length normalization (VTLN) in speech recogniton systems. We prove two properties of the bilinear mapping that motivated the band-diagonal transform proposed in M. Afify and O. Siohan, (ldquoConstrained maximum likelihood linear regression for speaker adaptation,rdquo in Proc. ICSLP, Beijing, China, Oct. 2000.) This is in contrast to what is stated in M. Pitz and H. Ney, (ldquoVocal tract length normalization equals linear transformation in cepstral space,rdquo IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp 930-944, September 2005) that the transform of Afify and Siohan was motivated by empirical observations.