A diagonal plus low-rank covariance model for computationally efficient source separation

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) | Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168169

A. Liutkus, Kazuyoshi Yoshii

Pages: 1-6

Citations: 8

Abstract

This paper presents an accelerated version of positive semidefinite tensor factorization (PSDTF) for blind source separation. PSDTF works better than nonnegative matrix factorization (NMF) because it drops the questionable assumption that audio signals can be whitened in the frequency domain by the short-time Fourier transform (STFT). This assumption holds only in the ideal situation where each frame is infinitely long and the target signal is completely stationary within each frame. PSDTF therefore works with full covariance matrices over frequency bins instead of forcing them to be diagonal, as NMF does. Although PSDTF significantly outperforms NMF in separation performance, it incurs a heavy computational cost because large covariance matrices must be inverted repeatedly. To solve this problem, we propose an intermediate model based on diagonal plus low-rank covariance matrices and derive an expectation-maximization (EM) algorithm that updates the parameters of PSDTF efficiently. Experimental results showed that our method reduces the computational complexity of PSDTF by several orders of magnitude without a significant decrease in separation performance.
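To illustrate why a diagonal plus low-rank structure avoids the costly inversions mentioned in the abstract, the sketch below solves a linear system with a matrix of the form diag(d) + U Uᵀ using the Woodbury identity, which only requires solving a small K × K system instead of an F × F one. This is a generic illustration and not the authors' EM algorithm: the abstract does not give the exact parameterization, so the names `d`, `U`, `woodbury_solve`, and `dense_matvec` are hypothetical, the matrices are real-valued for simplicity (STFT covariances would be complex Hermitian), and only the Python standard library is used to keep the example self-contained.

```python
def solve_small(A, b):
    """Solve A x = b for a small dense system (Gauss-Jordan, partial pivoting)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def woodbury_solve(d, U, b):
    """Solve (diag(d) + U U^T) x = b, with d of length F and U of shape F x K.

    Woodbury identity:
        x = D^-1 b - D^-1 U (I_K + U^T D^-1 U)^-1 U^T D^-1 b
    Only a K x K system is solved, so the cost is O(F K^2 + K^3)
    rather than the O(F^3) of a dense F x F solve.
    """
    F, K = len(U), len(U[0])
    Dinv_b = [b[f] / d[f] for f in range(F)]
    Dinv_U = [[U[f][k] / d[f] for k in range(K)] for f in range(F)]
    # Small K x K matrix I_K + U^T D^-1 U.
    C = [[(1.0 if i == j else 0.0)
          + sum(U[f][i] * Dinv_U[f][j] for f in range(F))
          for j in range(K)] for i in range(K)]
    rhs = [sum(U[f][k] * Dinv_b[f] for f in range(F)) for k in range(K)]
    z = solve_small(C, rhs)
    return [Dinv_b[f] - sum(Dinv_U[f][k] * z[k] for k in range(K))
            for f in range(F)]


def dense_matvec(d, U, x):
    """Compute (diag(d) + U U^T) x, used here only to check the solution."""
    F, K = len(U), len(U[0])
    Ut_x = [sum(U[f][k] * x[f] for f in range(F)) for k in range(K)]
    return [d[f] * x[f] + sum(U[f][k] * Ut_x[k] for k in range(K))
            for f in range(F)]


if __name__ == "__main__":
    # Toy example: F = 4 frequency bins, rank K = 2 (values are arbitrary).
    d = [2.0, 3.0, 4.0, 5.0]
    U = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0], [1.0, 1.0]]
    b = [1.0, 2.0, 3.0, 4.0]
    x = woodbury_solve(d, U, b)
    residual = max(abs(r - t) for r, t in zip(dense_matvec(d, U, x), b))
    print("max residual:", residual)
```

In practice a numerical library would replace the hand-written loops; the point is only that when K is much smaller than F, the expensive step shrinks from an F × F problem to a K × K one.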



