Identification of speech transients using variable frame rate analysis and wavelet packets.

Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference

Pub Date: 2006-01-01

DOI: 10.1109/IEMBS.2006.260720

Daniel M Rasetshwane, J Robert Boston, Ching-Chung Li

Pages: 1727-30

Citations: 9

Abstract

Speech transients are important cues for identifying and discriminating speech sounds. Yoo et al. and Tantibundhit et al. successfully identified speech transients and, by emphasizing them, improved the intelligibility of speech in noise. However, their methods are computationally intensive and unsuitable for real-time applications. This paper presents a method to identify and emphasize speech transients that combines subband decomposition by the wavelet packet transform with variable frame rate (VFR) analysis and unvoiced consonant detection. VFR analysis is applied to each wavelet packet to define a transitivity function that describes the extent to which the wavelet coefficients of that packet are changing. Unvoiced consonant detection identifies unvoiced consonant intervals, and the transitivity function is amplified during these intervals. The wavelet coefficients of each packet are multiplied by that packet's transitivity function, which amplifies coefficients at times when they are changing and attenuates them at times when they are steady. The inverse transform of the modified wavelet packet coefficients produces a signal corresponding to the speech transients, similar to the transients identified by Yoo et al. and Tantibundhit et al. A preliminary implementation of the algorithm runs more efficiently than those earlier methods.
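The pipeline described in the abstract (decompose into wavelet packets, compute a per-packet transitivity function, multiply, invert) can be illustrated with a minimal sketch. Everything specific here is an assumption, not the authors' implementation: the Haar filter, the frame length, and in particular the transitivity measure, which is a hypothetical stand-in (normalized frame-to-frame change in log energy) because the abstract does not give the VFR formula. The unvoiced consonant detection step is omitted. All function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """One Haar analysis step: (approximation, detail), each half the length of x."""
    half = len(x) // 2
    a = [(x[2 * i] + x[2 * i + 1]) / SQRT2 for i in range(half)]
    d = [(x[2 * i] - x[2 * i + 1]) / SQRT2 for i in range(half)]
    return a, d

def haar_merge(a, d):
    """Inverse of haar_split."""
    x = []
    for ai, di in zip(a, d):
        x.append((ai + di) / SQRT2)
        x.append((ai - di) / SQRT2)
    return x

def wp_decompose(x, levels):
    """Full wavelet packet tree; len(x) must be divisible by 2**levels."""
    packets = [list(x)]
    for _ in range(levels):
        nxt = []
        for p in packets:
            nxt.extend(haar_split(p))  # both children of every node are split
        packets = nxt
    return packets

def wp_reconstruct(packets):
    """Merge sibling packets pairwise until a single signal remains."""
    while len(packets) > 1:
        packets = [haar_merge(packets[i], packets[i + 1])
                   for i in range(0, len(packets), 2)]
    return packets[0]

def transitivity(coeffs, frame=2, eps=1e-12):
    """Assumed stand-in for the VFR-based transitivity function.

    Returns one weight in [0, 1] per coefficient: the absolute change in
    log energy between adjacent frames, normalized by the largest change
    within the packet. A steady packet gets all zeros.
    """
    starts = range(0, len(coeffs), frame)
    log_e = [math.log(sum(c * c for c in coeffs[s:s + frame]) + eps)
             for s in starts]
    change = [0.0] + [abs(b - a) for a, b in zip(log_e, log_e[1:])]
    peak = max(change)
    if peak == 0.0:
        return [0.0] * len(coeffs)
    weights = []
    for k, s in enumerate(starts):
        weights.extend([change[k] / peak] * len(coeffs[s:s + frame]))
    return weights

def emphasize_transients(x, levels=3, frame=2):
    """Weight each packet by its transitivity function, then invert."""
    packets = wp_decompose(x, levels)
    modified = [[c * t for c, t in zip(p, transitivity(p, frame))]
                for p in packets]
    return wp_reconstruct(modified)
```

Calling `emphasize_transients` on a signal that is steady, then steps to a new level, keeps only the region just after the step and suppresses the steady portions, which is the qualitative behaviour the abstract describes. A real system would use longer wavelet filters, speech-appropriate frame lengths, and the additional gain during unvoiced consonant intervals.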

