
Latest publications: 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)

A diagonal plus low-rank covariance model for computationally efficient source separation
A. Liutkus, Kazuyoshi Yoshii
This paper presents an accelerated version of positive semidefinite tensor factorization (PSDTF) for blind source separation. PSDTF works better than nonnegative matrix factorization (NMF) by dropping the arguable assumption that audio signals can be whitened in the frequency domain by the short-time Fourier transform (STFT). Indeed, this assumption only holds true in an ideal situation where each frame is infinitely long and the target signal is completely stationary within each frame. PSDTF thus deals with full covariance matrices over frequency bins instead of forcing them to be diagonal as in NMF. Although PSDTF significantly outperforms NMF in terms of separation performance, it suffers from a heavy computational cost due to the repeated inversion of large covariance matrices. To solve this problem, we propose an intermediate model based on diagonal plus low-rank covariance matrices and derive an expectation-maximization (EM) algorithm for efficiently updating the parameters of PSDTF. Experimental results showed that our method reduces the complexity of PSDTF by several orders of magnitude without a significant decrease in separation performance.
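The computational saving offered by a diagonal-plus-low-rank covariance model rests on the Woodbury matrix identity: inverting D + UUᵀ requires solving only an r×r system rather than inverting a full F×F matrix. The following is a minimal numerical sketch of that identity (illustrative only; it does not reproduce the paper's EM updates, and all sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
F, r = 200, 4                      # frequency bins, low-rank order (hypothetical)

d = rng.random(F) + 0.5            # positive diagonal part D
U = rng.standard_normal((F, r))    # low-rank factor
Sigma = np.diag(d) + U @ U.T       # diagonal-plus-low-rank covariance

# Woodbury identity:
#   (D + U U^T)^{-1} = D^{-1} - D^{-1} U (I_r + U^T D^{-1} U)^{-1} U^T D^{-1}
# Only an r x r linear system is solved, O(F r^2) instead of O(F^3).
Dinv_U = U / d[:, None]                          # D^{-1} U
core = np.eye(r) + U.T @ Dinv_U                  # r x r matrix
Sigma_inv = np.diag(1.0 / d) - Dinv_U @ np.linalg.solve(core, Dinv_U.T)
```

Multiplying `Sigma` by `Sigma_inv` recovers the identity matrix up to floating-point error, confirming the inverse without any F×F inversion.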
DOI: 10.1109/MLSP.2017.8168169 (pp. 1-6, September 2017)
Citations: 8
Partitioning in signal processing using the object migration automaton and the pursuit paradigm
Abdolreza Shirvani, B. Oommen
Data in all Signal Processing (SP) applications is being generated super-exponentially, and at an ever-increasing rate. A meaningful way to pre-process it so as to achieve feasible computation is to partition the data [5]. Indeed, the task of partitioning is one of the most difficult problems in computing, and it has extensive applications in solving real-life problems, especially when the amount of SP data (i.e., images, voices, speakers, libraries, etc.) to be processed is prohibitively large. The problem is known to be NP-hard. The benchmark solution for the Equi-partitioning Problem (EPP) has involved the classic field of Learning Automata (LA), and the corresponding algorithm, the Object Migration Automaton (OMA), has been used in numerous application domains. The OMA, however, is a fixed-structure machine and does not incorporate the Pursuit concept that has recently significantly enhanced the field of LA. In this paper, we pioneer the incorporation of the Pursuit concept into the OMA. We do this by a non-intuitive paradigm, namely that of removing (or discarding) from the query stream queries that could be counter-productive. This can be perceived as a filtering agent triggered by a pursuit-based module. The resulting machine, referred to as the Pursuit OMA (POMA), has been rigorously tested in all the standard benchmark environments. Indeed, in certain extreme environments it is almost ten times faster than the original OMA. The application of the POMA to signal processing applications is extremely promising.
DOI: 10.1109/MLSP.2017.8168149 (pp. 1-6, September 2017)
Citations: 2
On generating mixing noise signals with basis functions for simulating noisy speech and learning DNN-based speech enhancement models
Shi-Xue Wen, Jun Du, Chin-Hui Lee
We first examine the generalization issue with the noise samples used in training nonlinear mapping functions between noisy and clean speech features for deep neural network (DNN) based speech enhancement. Then an empirical proof is established to explain why the DNN-based approach has a good noise generalization capability provided that a large collection of noise types is included in generating diverse noisy speech samples for training. It is shown that an arbitrary noise signal segment can be well represented by a linear combination of microstructure noise bases. Accordingly, we propose to generate these mixing noise signals by designing a set of compact and analytic noise bases without using any realistic noise types. The experiments demonstrate that this noise generation scheme can yield performance comparable to that obtained using 50 real noise types. Furthermore, by supplementing the collected noise types with the synthesized noise bases, we observe remarkable performance improvements, implying not only that the need for a large collection of real-world noise signals can be alleviated, but also that a good noise generalization capability can be achieved.
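The core claim — that an arbitrary noise segment can be represented as a linear combination of noise bases — can be illustrated with an ordinary least-squares fit. The cosine-modulated random bases below are hypothetical stand-ins for the paper's compact analytic bases, chosen only so the sketch is self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 512          # samples per noise segment
k = 8            # number of noise bases (both sizes hypothetical)

# Hypothetical "noise bases": white noise modulated by different cosine
# envelopes, standing in for the paper's compact analytic bases.
t = np.arange(n)
bases = np.stack(
    [np.cos(2 * np.pi * (i + 1) * t / n) * rng.standard_normal(n)
     for i in range(k)],
    axis=1,
)                                    # shape (n, k)

# An arbitrary noise segment that truly lies in the span of the bases.
w_true = rng.standard_normal(k)
segment = bases @ w_true

# Represent the segment as a linear combination of the bases (least squares).
w_hat, *_ = np.linalg.lstsq(bases, segment, rcond=None)
residual = np.linalg.norm(bases @ w_hat - segment)
```

When the segment lies in the span of the bases, the fit recovers the mixing weights and the reconstruction residual is negligible; for real noise outside the span, the residual measures how well the basis set covers it.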
DOI: 10.1109/MLSP.2017.8168192 (pp. 1-6, September 2017)
Citations: 3
Navigation-Based learning for survey trajectory classification in autonomous underwater vehicles
M. D. L. Alvarez, H. Hastie, D. Lane
Time-series sensor data processing is indispensable for system monitoring. Working with autonomous vehicles requires mechanisms that provide insightful information about the status of a mission. In a setting where time and resources are limited, trajectory classification plays a vital role in mission monitoring and failure detection. In this context, we use navigational data to interpret trajectory patterns and classify them. We implement Long Short-Term Memory (LSTM) based Recurrent Neural Networks (RNN) that learn the most commonly used survey trajectory patterns from surveys executed by two types of Autonomous Underwater Vehicles (AUV). We compare the performance of our network against baseline machine learning methods.
DOI: 10.1109/MLSP.2017.8168137 (pp. 1-6, September 2017)
Citations: 4
Mel-Generalized cepstral regularization for discriminative non-negative matrix factorization
Li Li, H. Kameoka, S. Makino
The non-negative matrix factorization (NMF) approach has been shown to work reasonably well for monaural speech enhancement tasks. This paper addresses two shortcomings of the original NMF approach: (1) the objective functions for basis training and separation (Wiener filtering) are inconsistent (the basis spectra are not trained so that the separated signal becomes optimal); (2) minimizing spectral divergence measures does not necessarily lead to an enhancement in the feature domain (e.g., the cepstral domain) or in terms of perceived quality. To address the first shortcoming, we previously proposed an algorithm for Discriminative NMF (DNMF), which optimizes the same objective for basis training and separation. To address the second shortcoming, we previously introduced frameworks called cepstral distance regularized NMF (CDRNMF) and mel-generalized cepstral distance regularized NMF (MGCRNMF), which aim to enhance speech both in the spectral domain and in the feature domain. This paper combines the goals of DNMF and MGCRNMF by incorporating the MGC regularizer into the DNMF objective function, and proposes an algorithm for parameter estimation. The experimental results revealed that the proposed method outperformed the baseline approaches.
DOI: 10.1109/MLSP.2017.8168142 (pp. 1-6, September 2017)
Citations: 0
Fast algorithm using summed area tables with unified layer performing convolution and average pooling
Akihiko Kasagi, T. Tabaru, H. Tamura
Convolutional neural networks (CNNs), in which several convolutional layers extract feature patterns from an input image, are one of the most popular network architectures used for image classification. The convolutional computation, however, carries a high computational cost, resulting in increased power consumption and processing time. In this paper, we propose a novel algorithm that substitutes a single layer for the pair formed by a convolutional layer and the following average-pooling layer. The key idea of the proposed scheme is to compute the output of the pair of original layers without computing the convolution itself. To this end, our algorithm first generates summed area tables (SATs) of the input images and then directly computes the output values from the SATs. We implemented our algorithm for both forward and backward propagation to evaluate its performance. Our experimental results showed that our algorithm is 17.1 times faster than the original algorithm for the same parameters used in ResNet-34.
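The summed-area-table idea the paper builds on can be sketched in a few lines: once the SAT is computed (two cumulative sums), any rectangular window sum — and hence any average-pooling output — costs just four table lookups. This sketch shows plain SAT-based average pooling, not the paper's unified convolution-plus-pooling layer:

```python
import numpy as np

def summed_area_table(img):
    """SAT with a zero-padded first row/column: sat[i, j] = sum of img[:i, :j]."""
    sat = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    sat[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return sat

def avg_pool_sat(img, k):
    """k x k average pooling (stride k) via four SAT lookups per window."""
    sat = summed_area_table(img)
    h, w = img.shape[0] // k, img.shape[1] // k
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            r0, c0, r1, c1 = i * k, j * k, (i + 1) * k, (j + 1) * k
            # Window sum = inclusion-exclusion over the four SAT corners.
            out[i, j] = (sat[r1, c1] - sat[r0, c1]
                         - sat[r1, c0] + sat[r0, c0]) / (k * k)
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8))
ref = x.reshape(4, 2, 4, 2).mean(axis=(1, 3))   # direct 2x2 average pooling
fast = avg_pool_sat(x, 2)
```

The SAT result matches direct block averaging exactly, while the per-window cost is constant regardless of the pooling window size.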
DOI: 10.1109/MLSP.2017.8168154 (pp. 1-6, September 2017)
Citations: 19
Discriminating schizophrenia from normal controls using resting state functional network connectivity: A deep neural network and layer-wise relevance propagation method
Weizheng Yan, S. Plis, V. Calhoun, Shengfeng Liu, R. Jiang, T. Jiang, J. Sui
Deep learning has gained considerable attention in the scientific community, breaking benchmark records in many fields such as speech and visual recognition [1]. Motivated by extending the advancement of deep learning approaches to brain imaging classification, we propose a framework, called "deep neural network (DNN) + layer-wise relevance propagation (LRP)", to distinguish schizophrenia patients (SZ) from healthy controls (HCs) using functional network connectivity (FNC). 1100 Chinese subjects from 7 sites are included, each with a 50×50 FNC matrix resulting from group ICA on resting-state fMRI data. The proposed DNN+LRP not only improves classification accuracy significantly compared to four state-of-the-art classification methods (84% vs. less than 79%, 10-fold cross-validation) but also enables identification of the FNC patterns contributing most to SZ classification, which cannot be easily traced back by general DNN models. By conducting LRP, we identified the FNC patterns that exhibit the highest discriminative power in SZ classification. More importantly, when using leave-one-site-out cross-validation (using 6 sites for training and 1 site for testing, 7 times in total), the cross-site classification accuracy reached 82%, suggesting high robustness and generalization performance of the proposed method, promising wide utility in the community and great potential for biomarker identification of brain disorders.
DOI: 10.1109/MLSP.2017.8168179 (pp. 1-6, September 2017)
Citations: 40
MIML-AI: Mixed-supervision multi-instance multi-label learning with auxiliary information
Tarn Nguyen, R. Raich, Xiaoli Z. Fern, Anh T. Pham
Manual labeling of individual instances is time-consuming. This is commonly resolved by labeling a bag of instances with a single common label or label set. However, this approach is still time-costly for large datasets. In this paper, we propose a mixed-supervision multi-instance multi-label learning model for learning from easily available meta-data information (MIML-AI). This auxiliary information is normally collected automatically with the data, e.g., image location information or a document author's name. We propose a discriminative graphical model with exact inference to train a classifier based on auxiliary label information and a small number of labeled bags. This strategy utilizes meta-data as a means of providing a weaker label as an alternative to intensive manual labeling. Experiments on real data illustrate the effectiveness of our proposed method relative to current approaches, which do not use the information from bags that contain only meta-data label information.
DOI: 10.1109/MLSP.2017.8168107 (pp. 1-6, September 2017)
Citations: 0
Infinite probabilistic latent component analysis for audio source separation
Kazuyoshi Yoshii, Eita Nakamura, Katsutoshi Itoyama, Masataka Goto
This paper presents a statistical method of audio source separation based on a nonparametric Bayesian extension of probabilistic latent component analysis (PLCA). A major approach to audio source separation is to use nonnegative matrix factorization (NMF), which approximates the magnitude spectrum of a mixture signal at each frame as the weighted sum of fewer source spectra. Another approach is to use PLCA, which regards the magnitude spectrogram as a two-dimensional histogram of "sound quanta" and classifies each quantum into one of the sources. While NMF has a physically natural interpretation, PLCA has been used successfully for music signal analysis. To enable PLCA to estimate the number of sources, we propose Dirichlet process PLCA (DP-PLCA) and derive two kinds of learning methods based on variational Bayes and collapsed Gibbs sampling. Unlike existing learning methods for nonparametric Bayesian NMF based on the beta or gamma processes (BP-NMF and GaP-NMF), our sampling method can efficiently search for the optimal number of sources without truncating the number of sources to be considered. Experimental results showed that DP-PLCA is superior to GaP-NMF in terms of source number estimation.
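The NMF approximation described above can be illustrated with the standard multiplicative updates for the KL divergence, which are known to correspond (up to normalization) to the EM updates of basic PLCA. This is a generic sketch on synthetic data, not the paper's DP-PLCA inference, and all sizes are made up:

```python
import numpy as np

rng = np.random.default_rng(2)
F, T, K = 16, 24, 3            # frequency bins, frames, bases (hypothetical)

# Synthetic nonnegative "magnitude spectrogram" with exact rank-K structure.
W_true = rng.random((F, K))
H_true = rng.random((K, T))
V = W_true @ H_true

# Multiplicative updates minimizing the KL divergence D_KL(V || W H).
W = rng.random((F, K)) + 0.1
H = rng.random((K, T)) + 0.1
for _ in range(300):
    WH = W @ H
    W *= ((V / WH) @ H.T) / H.sum(axis=1)            # update basis spectra
    WH = W @ H
    H *= (W.T @ (V / WH)) / W.sum(axis=0)[:, None]   # update activations

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)  # relative Frobenius error
```

Because the synthetic spectrogram is exactly rank K, the updates drive the reconstruction error close to zero; on real audio the residual reflects model mismatch, and DP-PLCA's contribution is choosing K itself rather than fixing it in advance.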
{"title":"Infinite probabilistic latent component analysis for audio source separation","authors":"Kazuyoshi Yoshii, Eita Nakamura, Katsutoshi Itoyama, Masataka Goto","doi":"10.1109/MLSP.2017.8168189","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168189","url":null,"abstract":"This paper presents a statistical method of audio source separation based on a nonparametric Bayesian extension of probabilistic latent component analysis (PLCA). A major approach to audio source separation is to use nonnegative matrix factorization (NMF) that approximates the magnitude spectrum of a mixture signal at each frame as the weighted sum of fewer source spectra. Another approach is to use PLCA that regards the magnitude spectrogram as a two-dimensional histogram of “sound quanta” and classifies each quantum into one of sources. While NMF has a physically-natural interpretation, PLCA has been used successfully for music signal analysis. To enable PLCA to estimate the number of sources, we propose Dirichlet process PLCA (DP-PLCA) and derive two kinds of learning methods based on variational Bayes and collapsed Gibbs sampling. Unlike existing learning methods for nonparametric Bayesian NMF based on the beta or gamma processes (BP-NMF and GaP-NMF), our sampling method can efficiently search for the optimal number of sources without truncating the number of sources to be considered. 
Experimental results showed that DP-PLCA is superior to GaP-NMF in terms of source number estimation.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"15 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87005228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 1
Difference-of-Convex optimization for variational KL-corrected inference in Dirichlet process mixtures
Rasmus Bonnevie, Mikkel N. Schmidt, Morten Mørup
Variational methods for approximate inference in Bayesian models optimise a lower bound on the marginal likelihood, but the optimization problem often suffers from being nonconvex and high-dimensional. This can be alleviated by working in a collapsed domain where a part of the parameter space is marginalized. We consider the KL-corrected collapsed variational bound and apply it to Dirichlet process mixture models, allowing us to reduce the optimization space considerably. We find that the variational bound exhibits consistent and exploitable structure, allowing the application of difference-of-convex optimization algorithms. We show how this yields an interpretable fixed-point update algorithm in the collapsed setting for the Dirichlet process mixture model. We connect this update formula to classical coordinate ascent updates, illustrating that the proposed improvement surprisingly reduces to the traditional scheme.
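The difference-of-convex scheme the abstract relies on can be illustrated on a one-dimensional toy problem. The split f = g − h below is our own illustrative choice, not the paper's collapsed variational bound: DCA linearizes the concave part −h at the current iterate and minimizes the convex surrogate, here available in closed form:

```python
# Toy DCA (difference-of-convex algorithm) sketch. Minimize the nonconvex
# f(x) = x**4 - 4*x**2, written as g(x) = x**4 (convex) minus h(x) = 4*x**2
# (convex). Each step minimizes the convex surrogate g(x) - h'(x_k) * x.
def dca_step(x):
    # Surrogate: x**4 - 8*x_k*x; its minimizer solves 4*x**3 = 8*x_k,
    # i.e. x = (2*x_k)**(1/3).
    return (2.0 * x) ** (1.0 / 3.0)

x = 1.0
for _ in range(50):
    x = dca_step(x)
# x converges to sqrt(2), a global minimizer of f
```

The iteration is a fixed-point map whose fixed points are stationary points of f, mirroring how the paper's DCA update reduces to a fixed-point (coordinate-ascent-like) formula on the collapsed variational bound.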
{"title":"Difference-of-Convex optimization for variational kl-corrected inference in dirichlet process mixtures","authors":"Rasmus Bonnevie, Mikkel N. Schmidt, Morten Mørup","doi":"10.1109/MLSP.2017.8168159","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168159","url":null,"abstract":"Variational methods for approximate inference in Bayesian models optimise a lower bound on the marginal likelihood, but the optimization problem often suffers from being nonconvex and high-dimensional. This can be alleviated by working in a collapsed domain where a part of the parameter space is marginalized. We consider the KL-corrected collapsed variational bound and apply it to Dirichlet process mixture models, allowing us to reduce the optimization space considerably. We find that the variational bound exhibits consistent and exploitable structure, allowing the application of difference-of-convex optimization algorithms. We show how this yields an interpretable fixed-point update algorithm in the collapsed setting for the Dirichlet process mixture model. We connect this update formula to classical coordinate ascent updates, illustrating that the proposed improvement surprisingly reduces to the traditional scheme.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"3 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85756170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 0