Binary mask estimation for voiced speech segregation using Bayesian method

Shan Liang, Wenju Liu
Published in: The First Asian Conference on Pattern Recognition, 2011-11-01
DOI: 10.1109/ACPR.2011.6305053 (https://doi.org/10.1109/ACPR.2011.6305053)
Citations: 2

Abstract

Estimating the ideal binary mask (IBM) has been set as the computational goal of computational auditory scene analysis (CASA). Much effort has gone into IBM estimation with statistical learning methods. Current Bayesian methods usually estimate the mask value of each time-frequency (T-F) unit independently, using only local auditory features. In this paper, we propose a new Bayesian approach. First, a set of pitch-based auditory features is summarized to exploit the inherent characteristics of reliable and unreliable T-F units, and a rough estimate is obtained according to the maximum likelihood (ML) rule. Then, we propose a prior model derived from onset/offset segmentation to improve the estimate. Finally, an efficient Markov chain Monte Carlo (MCMC) procedure is applied to approach the maximum a posteriori (MAP) estimate. The proposed method is evaluated on Cooke's 100 mixtures and compared with a previous model. Experiments show that our method performs better.
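
The abstract does not give the model's details, so the following is only a minimal sketch of the general pattern it describes: a per-unit ML decision refined toward a MAP estimate by an MCMC sampler. Everything specific here is an assumption made for illustration. The function names `ml_mask` and `map_mask_gibbs` are hypothetical, and the per-unit log-likelihood ratios are taken as given inputs rather than computed from pitch-based features. An Ising-style neighbour-agreement prior stands in for the paper's onset/offset-segmentation prior, and annealed Gibbs sampling is just one possible MCMC procedure.

```python
import math
import random


def ml_mask(llr):
    """Per-unit ML decision: label a T-F unit 1 when its log-likelihood ratio
    log p(x | target-dominant) - log p(x | interference-dominant) is positive."""
    return [[1 if v > 0 else 0 for v in row] for row in llr]


def map_mask_gibbs(llr, beta=1.0, sweeps=50, seed=0):
    """Approach a MAP mask by annealed Gibbs sampling.

    Assumed prior (illustrative only, not the paper's): an Ising-style term
    that rewards agreement between a unit and its 4 time-frequency neighbours,
    weighted by beta.
    """
    rng = random.Random(seed)
    n_chan, n_frame = len(llr), len(llr[0])
    mask = ml_mask(llr)  # start from the rough ML estimate
    for sweep in range(sweeps):
        # Linear cooling: sampling becomes nearly greedy in the last sweeps.
        temp = max(0.02, 1.0 - sweep / sweeps)
        for c in range(n_chan):
            for t in range(n_frame):
                # Neighbour vote: +1 per neighbour labelled 1, -1 per neighbour labelled 0.
                vote = 0
                for dc, dt in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    cc, tt = c + dc, t + dt
                    if 0 <= cc < n_chan and 0 <= tt < n_frame:
                        vote += 1 if mask[cc][tt] else -1
                # Tempered log posterior odds of label 1 versus label 0.
                log_odds = (llr[c][t] + beta * vote) / temp
                log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp
                p_one = 1.0 / (1.0 + math.exp(-log_odds))
                mask[c][t] = 1 if rng.random() < p_one else 0
    return mask


# Toy input: a 5x5 grid of confident "target" units with one weakly
# negative unit in the middle (all values are made up for illustration).
toy_llr = [[5.0] * 5 for _ in range(5)]
toy_llr[2][2] = -0.5

print("ML mask centre: ", ml_mask(toy_llr)[2][2])
print("MAP mask centre:", map_mask_gibbs(toy_llr)[2][2])
```

In this toy setup the independent ML rule labels the isolated centre unit 0, while the neighbourhood prior pulls it to 1. This is the kind of contextual correction that motivates adding a prior over T-F units instead of deciding each unit from local features alone.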