Comparison of vocal tract shape estimation techniques based on formant frequencies, autocorrelation, covariance and lattice

Ashwini S. Patil, M. Shah
Published in: 2015 International Conference on Nascent Technologies in the Engineering Field (ICNTE)
DOI: 10.1109/ICNTE.2015.7029934 (https://doi.org/10.1109/ICNTE.2015.7029934)
Citations: 3

Abstract

The vocal tract is one of the most important systems in speech production; it begins at the glottis and ends at the lips. Vocal tract shape (VTS) is defined as the varying cross-sectional area from glottis to lips. A literature review shows that most research on vocal tract shape estimation (VTSE) is based on Wakita's algorithm, which in turn is based on the autocorrelation of speech. The objective of this work is to investigate VTSE based on formant frequencies and on the autocorrelation, covariance and lattice methods. For validation, available vocal tract shape data for vowels obtained by magnetic resonance imaging (MRI) were used. The vowels /a/, /i/, /u/ and /o/, the vowel-semivowel-vowel utterances /aya/ and /awa/, and some VCV syllables (/apa/, /uba/) were analyzed for three female and three male speakers. The formant frequency, autocorrelation, covariance and lattice methods all gave satisfactory results for vowels and semivowels. However, when compared with the MRI shapes, the vowel shapes estimated with the formant frequency technique were more realistic. An investigation of the effect of analysis frame length on VTSE showed that the lattice method required a shorter analysis frame than the autocorrelation and covariance methods, and that its estimated areas were more consistent across analysis frames than those of the other methods.
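The autocorrelation route mentioned in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a Hamming-windowed frame, a Levinson-Durbin recursion to obtain reflection coefficients, and a lossless-tube area recursion that starts from a normalized area of 1.0 at the first section. The function names, the sign convention for the reflection coefficients, and the direction in which sections are numbered are all assumptions made for illustration; published formulations differ on these points.

```python
import numpy as np


def autocorrelation(frame, order):
    """Autocorrelation lags 0..order of a Hamming-windowed frame."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    return np.array([np.dot(x[: len(x) - lag], x[lag:]) for lag in range(order + 1)])


def levinson_reflection(r):
    """Reflection coefficients from autocorrelation lags via Levinson-Durbin.

    Assumed convention: predictor polynomial A(z) = 1 + sum_j a[j] z^-j,
    with k[i-1] equal to the last coefficient of the order-i predictor.
    """
    r = np.asarray(r, dtype=float)
    if r[0] <= 0.0:
        raise ValueError("frame has no energy; cannot estimate a shape")
    order = len(r) - 1
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    k = np.zeros(order)
    for i in range(1, order + 1):
        # Correlation between the current prediction error and lag i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k_i = -acc / err
        k[i - 1] = k_i
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k_i * a_prev[i - j]
        a[i] = k_i
        err *= 1.0 - k_i ** 2
    return k


def area_function(k, first_area=1.0):
    """Relative tube areas from reflection coefficients.

    Assumed recursion: A[i+1] = A[i] * (1 - k[i]) / (1 + k[i]).
    Areas are relative to first_area, not absolute cross sections.
    """
    areas = [first_area]
    for k_i in k:
        areas.append(areas[-1] * (1.0 - k_i) / (1.0 + k_i))
    return np.array(areas)


if __name__ == "__main__":
    # Synthetic frame standing in for a voiced speech segment (hypothetical data)
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(400)
    k = levinson_reflection(autocorrelation(frame, order=12))
    print(area_function(k))
```

Because the autocorrelation method yields reflection coefficients with magnitude below one, the estimated areas stay positive. The covariance and lattice methods compared in the paper would replace only the step that produces the reflection coefficients.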