Design of a Voice Activity Detection Algorithm based on Logarithmic Signal Energy

S. Ozaydin
2022 International Conference on Electrical and Computing Technologies and Applications (ICECTA), published 2022-11-23
DOI: 10.1109/ICECTA57148.2022.9990492 (https://doi.org/10.1109/ICECTA57148.2022.9990492)

Abstract

This article presents a new method for calculating the signal energy of speech segments in voice activity detection (VAD) algorithms. The μ-law signal compression method is adapted to calculate short-term signal energy. A simple VAD algorithm is designed to demonstrate the effectiveness of the proposed method. The same algorithm was also run with two conventional energy formulas, and the performance of each variant was evaluated using time-domain short-time energy features. The standard G.729 VAD algorithm was included for comparison. The detectors were tested on a variety of input speech signals with different types of background environmental noise, such as restaurant, vehicle, and street noise. With the new energy calculation, the detector achieved higher detection accuracy than the detectors based on the other two energy methods and identified voice-active regions effectively even in noisy conditions at low SNR. The results show that the VAD designed with the proposed energy formula outperforms traditional energy-based VAD methods and noticeably increases the detection rate even under adverse conditions.
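
The abstract does not state the proposed formula, so the following Python sketch is only one plausible reading of "adapting μ-law compression to short-term energy", not the paper's method. It assumes samples normalized to [-1, 1], the standard μ = 255, and a hypothetical energy defined as the mean μ-law-compressed magnitude per frame. The two "conventional" measures (sum of squares and mean absolute value) are common textbook choices and may not be the two used in the paper. The threshold rule, the frame sizes, and all function names are likewise illustrative assumptions.

```python
import math

MU = 255.0  # standard mu-law constant; the value used in the paper is not stated


def frames(signal, frame_len, hop):
    """Split a sequence of samples into (possibly overlapping) frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def energy_sum_squares(frame):
    """Conventional short-time energy: sum of squared samples."""
    return sum(x * x for x in frame)


def energy_mean_abs(frame):
    """Conventional short-time magnitude: mean absolute sample value."""
    return sum(abs(x) for x in frame) / len(frame)


def mu_law(x, mu=MU):
    """Standard mu-law compression of a sample in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def energy_mu_law(frame, mu=MU):
    """Hypothetical mu-law-based energy: mean compressed magnitude."""
    return sum(abs(mu_law(x, mu)) for x in frame) / len(frame)


def simple_vad(signal, frame_len=160, hop=80, noise_frames=5, margin=2.0,
               energy_fn=energy_mu_law):
    """Return one boolean per frame: True where the frame is judged active.

    The noise floor is the mean energy of the first `noise_frames` frames,
    which are assumed (illustratively) to contain no speech. A frame is
    active when its energy exceeds `margin` times that floor.
    """
    energies = [energy_fn(f) for f in frames(signal, frame_len, hop)]
    floor = sum(energies[:noise_frames]) / noise_frames
    return [e > margin * floor for e in energies]
```

Passing `energy_fn=energy_sum_squares` or `energy_fn=energy_mean_abs` to `simple_vad` runs the same detector with a different energy measure, which mirrors the comparison setup the abstract describes. The multiplicative threshold only makes sense for non-negative energy values, so a dB-scaled energy would need an additive margin instead.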