
2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP): latest papers

Unsupervised domain adaptation with copula models
Cuong D. Tran, Ognjen Rudovic, V. Pavlovic
We study the task of unsupervised domain adaptation, where no labeled data from the target domain is provided during training time. To deal with the potential discrepancy between the source and target distributions, both in features and labels, we exploit a copula-based regression framework. The benefits of this approach are two-fold: (a) it allows us to model a broader range of conditional predictive densities beyond the common exponential family; (b) we show how to leverage Sklar's theorem, the essence of the copula formulation relating the joint density to the copula dependency functions, to find effective feature mappings that mitigate the domain mismatch. By transforming the data to a copula domain, we show on a number of benchmark datasets (including human emotion estimation), and using different regression models for prediction, that we can achieve a more robust and accurate estimation of target labels, compared to recently proposed feature transformation (adaptation) methods.
DOI: https://doi.org/10.1109/MLSP.2017.8168131 | Published: 2017-09-01 | Pages: 1-6
Citations: 3
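The abstract above describes transforming data to a copula domain. A minimal sketch of one common way to do that, the empirical probability integral transform, is given here. It is an assumption for illustration, not necessarily the authors' procedure. Each feature is replaced by its normalized rank, so two domains that differ only by a monotone rescaling map to the same values. The function name `to_copula_domain` and the toy data are hypothetical.

```python
def to_copula_domain(column):
    """Map a list of values to pseudo-observations in (0, 1) via average ranks."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values so they share one average rank.
        while j + 1 < n and column[order[j + 1]] == column[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Divide by n + 1 so every value stays strictly inside (0, 1).
    return [r / (n + 1) for r in ranks]


# Hypothetical source and target features: same ordering, different scale.
source = [3.2, 1.5, 9.9, 4.4]
target = [32.0, 15.0, 99.0, 44.0]
u_source = to_copula_domain(source)
u_target = to_copula_domain(target)
```

Because only ranks are used, the marginal scale of each domain drops out and what remains is the dependence structure, which is the part a copula describes.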
Asymptotic performance of regularized quadratic discriminant analysis based classifiers
Khalil Elkhalil, A. Kammoun, Romain Couillet, T. Al-Naffouri, Mohamed-Slim Alouini
This paper carries out a large-dimensional analysis of the standard regularized quadratic discriminant analysis (QDA) classifier, designed on the assumption that the data arise from a Gaussian mixture model. The analysis relies on fundamental results from random matrix theory (RMT) in the regime where both the number of features and the number of training samples in each class grow large at the same pace. Under some mild assumptions, we show that the asymptotic classification error converges to a deterministic quantity that depends only on the covariances and means associated with each class as well as the problem dimensions. Such a result permits a better understanding of the performance of regularized QDA and can be used to determine the optimal regularization parameter that minimizes the misclassification error probability. Despite being valid only for Gaussian data, our theoretical findings are shown to predict with high accuracy the performance achieved on real data sets drawn from popular databases, thereby making an interesting connection between theory and practice.
DOI: https://doi.org/10.1109/MLSP.2017.8168172 | Published: 2017-09-01 | Pages: 1-6
Citations: 15
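To make the role of the regularization parameter concrete, here is a minimal sketch of a regularized QDA classifier. It assumes a ridge-style regularization that adds `gamma * I` to each class covariance estimate. The paper's exact parameterization may differ, and the names `fit_rqda`, `predict_rqda`, and `gamma`, as well as the synthetic data, are hypothetical.

```python
import numpy as np


def fit_rqda(X, y, gamma):
    """Estimate per-class mean, ridge-regularized covariance, and prior."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        # Assumed regularization: sample covariance plus gamma times identity.
        cov = np.cov(Xc, rowvar=False) + gamma * np.eye(X.shape[1])
        params[label] = (mean, cov, len(Xc) / len(X))
    return params


def predict_rqda(params, X):
    """Assign each row of X to the class with the largest Gaussian log-score."""
    labels = list(params)
    scores = []
    for label in labels:
        mean, cov, prior = params[label]
        diff = X - mean
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every row to the class mean.
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        scores.append(-0.5 * logdet - 0.5 * maha + np.log(prior))
    return np.array(labels)[np.argmax(np.array(scores), axis=0)]


# Synthetic two-class Gaussian data with different means and covariances.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(4.0, 2.0, (50, 5))])
y = np.array([0] * 50 + [1] * 50)
model = fit_rqda(X, y, gamma=0.1)
accuracy = float(np.mean(predict_rqda(model, X) == y))
```

Sweeping `gamma` over a grid and measuring held-out error is the empirical counterpart of the optimal-parameter selection that the abstract derives analytically.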
Faster R-CNN with DenseNet for scale-aware pedestrian detection vis-à-vis hard negative suppression
Suman Kumar Choudhury, R. P. Padhy, P. K. Sa
This paper presents a fully convolutional architecture for pedestrian detection. The DenseNet model is incorporated in the Faster R-CNN framework to extract the deep convolutional features. A two-phase approach is suggested to minimize the false positives caused by hard negative backgrounds. Feature maps from multiple intermediate layers are taken into consideration to facilitate small-scale detection. The proposed method is compared with a few competing schemes on two benchmark datasets. The obtained results demonstrate the potential of our approach in addressing real-world challenges.
DOI: https://doi.org/10.1109/MLSP.2017.8168128 | Published: 2017-09-01 | Pages: 1-6
Citations: 3
Identification of a thermal building model by learning the dynamics of the solar flux
Tahar Nabil, F. Roueff, J. Jicquel, A. Girard
This article deals with the identification of a dynamic building model from on-site input-output records. In practice, the solar gains, a key input, are often unobserved due to the cost of the associated sensor. We propose replacing this sensor with a cheap outdoor temperature sensor exposed to the sun. Our assumption is that the temperature bias between this sensor and a second, sheltered sensor is an indirect observation of the solar flux. We derive a novel state-space model for the outdoor temperature bias, in which sudden changes in the weather conditions are accounted for by occasional high-variance increments of the hidden state. The magnitude of the high values and the times at which they occur are estimated with an ℓ1-regularized maximum likelihood approach. Finally, this model is appended to a thermal building model based on an equivalent RC network, forming a conditionally linear Gaussian state-space system. We apply the Expectation-Maximization algorithm with Rao-Blackwellised particle smoothing in order to learn the thermal model. Despite the indirect observation of the solar flux, we are able to correctly estimate the physical parameters of the building, in particular the static coefficients and the fast time constant.
DOI: https://doi.org/10.1109/MLSP.2017.8168112 | Published: 2017-09-01 | Pages: 1-6
Citations: 0
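For readers unfamiliar with equivalent RC networks, the sketch below simulates the simplest possible case: a first-order model with one thermal resistance and one capacitance, driven by the outdoor temperature and a solar flux input. The model order, the forward-Euler discretization, and all names and parameter values (`simulate_rc`, `R`, `C`, `gain`) are assumptions for illustration. The abstract does not specify the network used in the paper.

```python
def simulate_rc(t_in0, t_out, solar, R, C, gain, dt):
    """Forward-Euler simulation of C * dT/dt = (T_out - T) / R + gain * solar."""
    temps = [t_in0]
    for outdoor, flux in zip(t_out, solar):
        indoor = temps[-1]
        d_temp = ((outdoor - indoor) / R + gain * flux) / C
        temps.append(indoor + dt * d_temp)
    return temps


# Hypothetical parameters: R in K/W, C in J/K, so R * C = 10,000 s.
R, C, gain, dt, steps = 0.01, 1.0e6, 1.0, 600.0, 1000

# No sun: the indoor temperature relaxes to the outdoor temperature.
temps_no_sun = simulate_rc(20.0, [10.0] * steps, [0.0] * steps, R, C, gain, dt)

# Constant flux of 500 W: the steady state is T_out + R * gain * flux.
temps_sun = simulate_rc(20.0, [10.0] * steps, [500.0] * steps, R, C, gain, dt)
```

The steady-state offset `R * gain * flux` is a static coefficient and `R * C` is a time constant, which are the kinds of physical quantities the abstract reports estimating.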
How can we detect anomalies from subsampled audio signals?
Y. Kawaguchi, Takashi Endo
We aim to reduce the cost of sound monitoring for machinery maintenance by reducing the sampling rate, i.e., sub-Nyquist sampling. Monitoring based on sub-Nyquist sampling requires two sub-systems: an on-site sub-system for sampling machinery sounds at a low rate and an off-site sub-system for detecting anomalies from the subsampled signal. This paper proposes a method for achieving both sub-systems. First, the proposed method uses non-uniform sampling to encode frequencies above the Nyquist frequency. Second, the method applies a long short-term memory (LSTM)-based autoencoder network for detecting anomalies. The novelty of the proposed network is that the subsampled time-domain signal is demultiplexed and received as input in an end-to-end manner, enabling anomaly detection from the subsampled signal. Experimental results indicate that our method is suitable for anomaly detection from the subsampled signal.
DOI: https://doi.org/10.1109/MLSP.2017.8168164 | Published: 2017-09-01 | Pages: 1-6
Citations: 45
Local Gaussian model with source-set constraints in audio source separation
Rintaro Ikeshita, M. Togami, Y. Kawaguchi, Yusuke Fujita, Kenji Nagamatsu
To improve the performance of blind audio source separation of convolutive mixtures, the local Gaussian model (LGM) having full rank covariance matrices proposed by Duong et al. is extended. The previous model basically assumes that all sources contribute to each time-frequency slot, which may fail to capture the characteristic of signals with many intermittent silent periods. A constraint on source sets that contribute to each time-frequency slot is therefore explicitly introduced. This approach can be regarded as a relaxation of the sparsity constraint in the conventional time-frequency mask. The proposed model is jointly optimized among the original local Gaussian model parameters, the relaxed version of the time-frequency mask, and a permutation alignment, leading to a robust permutation-free algorithm. We also present a novel multi-channel Wiener filter weighted by a relaxed version of the time-frequency mask. Experimental results over noisy speech signals show that the proposed model is effective compared with the original local Gaussian model and is comparable to its extension, the multi-channel nonnegative matrix factorization.
DOI: https://doi.org/10.1109/MLSP.2017.8168170 | Published: 2017-09-01 | Pages: 1-6
Citations: 2
Macau: Scalable Bayesian factorization with high-dimensional side information using MCMC
J. Simm, Adam Arany, Pooya Zakeri, Tom Haber, J. Wegner, V. Chupakhin, H. Ceulemans, Y. Moreau
Bayesian matrix factorization is a method of choice for making predictions for large-scale incomplete matrices, due to the availability of efficient Gibbs sampling schemes and its robustness to overfitting. In this paper, we consider factorization of large-scale matrices with high-dimensional side information. However, sampling the link matrix for the side information with standard approaches costs O(F^3) time, where F is the dimensionality of the features. To overcome this limitation we first propose a prior for the link matrix whose strength is proportional to the scale of the latent variables. Second, using this prior, we derive an efficient sampler with complexity linear in the number of non-zeros, O(N_nz), by leveraging Krylov subspace methods such as block conjugate gradient, allowing us to handle million-dimensional side information. We demonstrate the effectiveness of our proposed method on a drug-protein interaction prediction task.
DOI: https://doi.org/10.1109/MLSP.2017.8168143 | Published: 2017-09-01 | Pages: 1-6
Citations: 30
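The complexity claim in the abstract above rests on the fact that Krylov methods only need matrix-vector products. A minimal sketch follows, assuming a ridge-type system (X^T X + lam * I) beta = b. It uses plain conjugate gradient with a single right-hand side as a simpler stand-in for the block conjugate gradient named in the abstract, and the function name `cg_ridge_solve` and the test data are hypothetical.

```python
import numpy as np


def cg_ridge_solve(X, b, lam, tol=1e-10, max_iter=1000):
    """Solve (X^T X + lam * I) beta = b without ever forming X^T X."""

    def apply_A(v):
        # Two matrix-vector products; for a sparse X this costs O(nnz).
        return X.T @ (X @ v) + lam * v

    beta = np.zeros_like(b, dtype=float)
    r = b - apply_A(beta)  # residual
    p = r.copy()           # search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ap = apply_A(p)
        alpha = rs_old / (p @ Ap)
        beta += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return beta


# Small dense example so the result can be checked against a direct solve.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
b = rng.normal(size=30)
beta_cg = cg_ridge_solve(X, b, lam=1.0)
beta_direct = np.linalg.solve(X.T @ X + np.eye(30), b)
```

A direct solve factorizes an F-by-F matrix, which is the cubic cost the abstract refers to. The iterative version never builds that matrix, so its per-iteration cost is governed by the number of non-zeros in `X`.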
A recurrent encoder-decoder approach with skip-filtering connections for monaural singing voice separation
S. Mimilakis, K. Drossos, T. Virtanen, G. Schuller
The objective of deep learning methods based on encoder-decoder architectures for music source separation is to approximate either ideal time-frequency masks or spectral representations of the target music source(s). The spectral representations are then used to derive time-frequency masks. In this work we introduce a method to directly learn time-frequency masks from an observed mixture magnitude spectrum. We employ recurrent neural networks and train them using prior knowledge only for the magnitude spectrum of the target source. To assess the performance of the proposed method, we focus on the task of singing voice separation. The results from an objective evaluation show that our proposed method provides results comparable to deep learning based methods that operate over complicated signal representations. Compared to previous methods that approximate time-frequency masks, our method improves the signal-to-distortion ratio by 3.8 dB on average.
DOI: https://doi.org/10.1109/MLSP.2017.8168117 | Published: 2017-09-01 | Pages: 1-6
Citations: 30
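As a small illustration of what a time-frequency mask does, the sketch below computes an ideal ratio mask from known source magnitudes and applies it to the mixture magnitude spectrum. This is only the masking step: in the paper the mask is learned by a recurrent network rather than computed from the true sources. The additive-magnitude mixture, the function names, and the tiny 2-by-2 "spectrograms" are assumptions for illustration.

```python
import numpy as np


def ratio_mask(target_mag, other_mag, eps=1e-12):
    """Ideal ratio mask: the fraction of each time-frequency bin owned by the target."""
    return target_mag / (target_mag + other_mag + eps)


def apply_mask(mixture_mag, mask):
    """Filter the mixture magnitude spectrum by element-wise multiplication."""
    return mask * mixture_mag


# Hypothetical magnitude spectrograms (rows: frequency bins, columns: frames).
voice = np.array([[4.0, 0.0], [1.0, 2.0]])
accompaniment = np.array([[0.0, 3.0], [1.0, 2.0]])
mixture = voice + accompaniment  # simplifying assumption: magnitudes add

mask = ratio_mask(voice, accompaniment)
voice_estimate = apply_mask(mixture, mask)
```

Mask values lie between 0 and 1, so a bin dominated by the accompaniment is attenuated while a bin dominated by the voice passes through almost unchanged.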
Rao-Blackwellized particle MCMC for parameter estimation in spatio-temporal Gaussian processes
R. Hostettler, S. Särkkä, S. Godsill
In this paper, we consider parameter estimation in latent, spatiotemporal Gaussian processes using particle Markov chain Monte Carlo methods. In particular, we use spectral decomposition of the covariance function to obtain a high-dimensional state-space representation of the Gaussian processes, which is assumed to be observed through a nonlinear non-Gaussian likelihood. We develop a Rao-Blackwellized particle Gibbs sampler to sample the state trajectory and show how to sample the hyperparameters and possible parameters in the likelihood. The proposed method is evaluated on a spatio-temporal population model and the predictive performance is evaluated using leave-one-out cross-validation.
DOI: https://doi.org/10.1109/MLSP.2017.8168171 | Published: 2017-09-01 | Pages: 1-6
Citations: 3
Leveraging deep neural networks with nonnegative representations for improved environmental sound classification
Victor Bisot, R. Serizel, S. Essid, G. Richard
This paper introduces the use of representations based on nonnegative matrix factorization (NMF) to train deep neural networks with applications to environmental sound classification. Deep learning systems for sound classification usually rely on the network to learn meaningful representations from spectrograms or hand-crafted features. Instead, we introduce an NMF-based feature learning stage before training deep networks, whose usefulness is highlighted in this paper, especially for multi-source acoustic environments such as sound scenes. We rely on two established unsupervised and supervised NMF techniques to learn better input representations for deep neural networks. This allows us, with simple architectures, to reach performance competitive with more complex systems such as convolutional networks for acoustic scene classification. The proposed systems outperform neural networks trained on time-frequency representations on two acoustic scene classification datasets, as well as the best systems from the 2016 DCASE challenge.
DOI: https://doi.org/10.1109/MLSP.2017.8168139 | Published: 2017-09-01 | Pages: 1-6
Citations: 7
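To show what an NMF-based feature learning stage produces, here is a minimal sketch of plain unsupervised NMF with the classic multiplicative updates for the Euclidean cost. It stands in for the two NMF variants mentioned in the abstract, whose details are not given here. The function name `nmf`, the rank, and the synthetic nonnegative matrix are assumptions for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (features x frames) as W @ H."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps  # dictionary of spectral templates
    H = rng.random((rank, V.shape[1])) + eps  # activations, one column per frame
    for _ in range(n_iter):
        # Multiplicative updates keep both factors nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic "spectrogram" that is exactly rank 3 and nonnegative.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 40))
W, H = nmf(V, rank=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In a pipeline like the one the abstract describes, the columns of `H` (or activations computed against a fixed `W`) would be the inputs handed to the neural network in place of raw time-frequency frames.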