Identification of speech transients using variable frame rate analysis and wavelet packets.

Daniel M Rasetshwane, J Robert Boston, Ching-Chung Li
Published in: Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference, 2006, pp. 1727-30.
DOI: 10.1109/IEMBS.2006.260720 (https://doi.org/10.1109/IEMBS.2006.260720)
Citations: 9

Abstract

Speech transients are important cues for identifying and discriminating speech sounds. Yoo et al. and Tantibundhit et al. succeeded in identifying speech transients and, by emphasizing them, improved the intelligibility of speech in noise. However, their methods are computationally intensive and unsuitable for real-time applications. This paper presents a method to identify and emphasize speech transients that combines subband decomposition by the wavelet packet transform with variable frame rate (VFR) analysis and unvoiced consonant detection. The VFR analysis is applied to each wavelet packet to define a transitivity function that describes the extent to which the wavelet coefficients of that packet are changing. Unvoiced consonant detection identifies unvoiced consonant intervals, and the transitivity function is amplified during these intervals. The wavelet coefficients of each packet are then multiplied by that packet's transitivity function, which amplifies coefficients at times when they are changing and attenuates them at times when they are steady. The inverse transform of the modified wavelet packet coefficients produces a signal corresponding to the speech transients, similar to the transients identified by Yoo et al. and Tantibundhit et al. A preliminary implementation of the algorithm runs more efficiently than those earlier methods.
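
The processing chain described in the abstract (decompose into wavelet packets, weight each packet's coefficients by a measure of how fast they are changing, then invert) can be illustrated with a small sketch. This is not the authors' implementation. The Haar filter, the frame length, the use of frame-to-frame log-energy change as the "transitivity" measure, and all function names are assumptions made for illustration, and the unvoiced consonant detection step is left out. The signal length is assumed to be divisible by 2**levels.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """One Haar analysis step: returns (approximation, detail), each half the length of x."""
    a = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return a, d

def haar_merge(a, d):
    """Inverse of haar_split."""
    x = []
    for ai, di in zip(a, d):
        x.extend([(ai + di) / SQRT2, (ai - di) / SQRT2])
    return x

def wp_decompose(x, levels):
    """Full wavelet packet tree: every node is split again at every level."""
    packets = [list(x)]
    for _ in range(levels):
        packets = [half for p in packets for half in haar_split(p)]
    return packets

def wp_reconstruct(packets):
    """Merge sibling packets pairwise until a single signal remains."""
    while len(packets) > 1:
        packets = [haar_merge(packets[i], packets[i + 1])
                   for i in range(0, len(packets), 2)]
    return packets[0]

def transitivity(coeffs, frame=4):
    """Hypothetical transitivity weights: change in log energy between
    consecutive frames, scaled to [0, 1] and repeated for each coefficient."""
    n_frames = len(coeffs) // frame
    if n_frames == 0:
        return [0.0] * len(coeffs)
    log_e = [math.log(sum(c * c for c in coeffs[k * frame:(k + 1) * frame]) + 1e-12)
             for k in range(n_frames)]
    change = [0.0] + [abs(log_e[k] - log_e[k - 1]) for k in range(1, n_frames)]
    peak = max(change) or 1.0  # avoid dividing by zero for a steady packet
    weights = []
    for c in change:
        weights.extend([c / peak] * frame)
    # Coefficients past the last full frame reuse the last frame's weight.
    weights.extend([weights[-1]] * (len(coeffs) - len(weights)))
    return weights

def extract_transient(x, levels=2, frame=4):
    """Weight every packet by its own transitivity function, then invert."""
    packets = wp_decompose(x, levels)
    modified = [[c * w for c, w in zip(p, transitivity(p, frame))] for p in packets]
    return wp_reconstruct(modified)
```

With these assumptions, a perfectly steady input is attenuated to zero, while an abrupt onset survives in the output around the time at which it occurs. A fuller version would also raise the weights inside detected unvoiced consonant intervals, as the abstract describes.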
