基于模型的无线声传感器网络语音活动检测

Yingke Zhao, J. Nielsen, M. G. Christensen, Jinzdona Chen
{"title":"基于模型的无线声传感器网络语音活动检测","authors":"Yingke Zhao, J. Nielsen, M. G. Christensen, Jinzdona Chen","doi":"10.23919/EUSIPCO.2018.8553457","DOIUrl":null,"url":null,"abstract":"One of the major challenges in wireless acoustic sensor networks (WASN) based speech enhancement is robust and accurate voice activity detection (VAD). VAD is widely used in speech enhancement, speech coding, speech recognition, etc. In speech enhancement applications, VAD plays an important role, since noise statistics can be updated during non-speech frames to ensure efficient noise reduction and tolerable speech distortion. Although significant efforts have been made in single channel VAD, few solutions can be found in the multichannel case, especially in WASN. In this paper, we introduce a distributed VAD by using model-based noise power spectral density (PSD) estimation. For each node in the network, the speech PSD and noise PSD are first estimated, then a distributed detection is made by applying the generalized likelihood ratio test (GLRT). The proposed global GLRT based VAD has a quite general form. Indeed, we can judge whether the speech is present or absent by using the current time frame and frequency band observation or by taking into account the neighbouring frames and bands. Finally, the distributed GLRT result is obtained by using a distributed consensus method, such as random gossip, i.e., the whole detection system does not need any fusion center. With the model-based noise estimation method, the proposed distributed VAD performs robustly under non-stationary noise conditions, such as babble noise. As shown in experiments, the proposed method outperforms traditional multichannel VAD methods in terms of detection accuracy.","PeriodicalId":303069,"journal":{"name":"2018 26th European Signal Processing Conference (EUSIPCO)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Model-Based Voice Activity Detection in Wireless Acoustic Sensor Networks\",\"authors\":\"Yingke Zhao, J. Nielsen, M. G. Christensen, Jinzdona Chen\",\"doi\":\"10.23919/EUSIPCO.2018.8553457\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"One of the major challenges in wireless acoustic sensor networks (WASN) based speech enhancement is robust and accurate voice activity detection (VAD). VAD is widely used in speech enhancement, speech coding, speech recognition, etc. In speech enhancement applications, VAD plays an important role, since noise statistics can be updated during non-speech frames to ensure efficient noise reduction and tolerable speech distortion. Although significant efforts have been made in single channel VAD, few solutions can be found in the multichannel case, especially in WASN. In this paper, we introduce a distributed VAD by using model-based noise power spectral density (PSD) estimation. For each node in the network, the speech PSD and noise PSD are first estimated, then a distributed detection is made by applying the generalized likelihood ratio test (GLRT). The proposed global GLRT based VAD has a quite general form. Indeed, we can judge whether the speech is present or absent by using the current time frame and frequency band observation or by taking into account the neighbouring frames and bands. Finally, the distributed GLRT result is obtained by using a distributed consensus method, such as random gossip, i.e., the whole detection system does not need any fusion center. With the model-based noise estimation method, the proposed distributed VAD performs robustly under non-stationary noise conditions, such as babble noise. As shown in experiments, the proposed method outperforms traditional multichannel VAD methods in terms of detection accuracy.\",\"PeriodicalId\":303069,\"journal\":{\"name\":\"2018 26th European Signal Processing Conference (EUSIPCO)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 26th European Signal Processing Conference (EUSIPCO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23919/EUSIPCO.2018.8553457\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 26th European Signal Processing Conference (EUSIPCO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/EUSIPCO.2018.8553457","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

基于无线声传感器网络的语音增强面临的主要挑战之一是鲁棒性和准确性语音活动检测。VAD广泛应用于语音增强、语音编码、语音识别等领域。在语音增强应用中,VAD起着重要的作用,因为噪声统计可以在非语音帧中更新,以确保有效的降噪和可容忍的语音失真。尽管在单通道VAD方面已经做出了巨大的努力,但在多通道情况下,特别是在无线局域网中,几乎没有找到解决方案。本文提出了一种基于模型的噪声功率谱密度(PSD)估计的分布式VAD。首先对网络中每个节点的语音PSD和噪声PSD进行估计,然后利用广义似然比检验(GLRT)进行分布式检测。所提出的基于全局GLRT的VAD具有相当一般的形式。实际上,我们可以通过观察当前的时间帧和频带,或者考虑相邻的帧和频带,来判断语音是否存在。最后,采用随机八卦等分布式一致性方法,即整个检测系统不需要任何融合中心,得到分布式GLRT结果。采用基于模型的噪声估计方法,使分布式VAD在非平稳噪声条件下具有鲁棒性。实验结果表明,该方法在检测精度上优于传统的多通道VAD方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Model-Based Voice Activity Detection in Wireless Acoustic Sensor Networks
One of the major challenges in wireless acoustic sensor networks (WASN) based speech enhancement is robust and accurate voice activity detection (VAD). VAD is widely used in speech enhancement, speech coding, speech recognition, etc. In speech enhancement applications, VAD plays an important role, since noise statistics can be updated during non-speech frames to ensure efficient noise reduction and tolerable speech distortion. Although significant efforts have been made in single channel VAD, few solutions can be found in the multichannel case, especially in WASN. In this paper, we introduce a distributed VAD by using model-based noise power spectral density (PSD) estimation. For each node in the network, the speech PSD and noise PSD are first estimated, then a distributed detection is made by applying the generalized likelihood ratio test (GLRT). The proposed global GLRT based VAD has a quite general form. Indeed, we can judge whether the speech is present or absent by using the current time frame and frequency band observation or by taking into account the neighbouring frames and bands. Finally, the distributed GLRT result is obtained by using a distributed consensus method, such as random gossip, i.e., the whole detection system does not need any fusion center. With the model-based noise estimation method, the proposed distributed VAD performs robustly under non-stationary noise conditions, such as babble noise. As shown in experiments, the proposed method outperforms traditional multichannel VAD methods in terms of detection accuracy.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Missing Sample Estimation Based on High-Order Sparse Linear Prediction for Audio Signals Multi-Shot Single Sensor Light Field Camera Using a Color Coded Mask Knowledge-Aided Normalized Iterative Hard Thresholding Algorithms for Sparse Recovery Two-Step Hybrid Multiuser Equalizer for Sub-Connected mmWave Massive MIMO SC-FDMA Systems How Much Will Tiny IoT Nodes Profit from Massive Base Station Arrays?
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1