
2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP): latest papers

Dictionary learning for pitch estimation in speech signals
F. Huang, P. Balázs
This paper presents an automatic approach for parameter training for a sparsity-based pitch estimation method that has been previously published. For this pitch estimation method, the harmonic dictionary is a key parameter that needs to be carefully prepared beforehand. In the original method, extensive human supervision and involvement are required to construct and label the dictionary. In this study, we propose to employ dictionary learning algorithms to learn the dictionary directly from training data. We apply and compare three typical dictionary learning algorithms, i.e., the method of optimized directions (MOD), K-SVD and online dictionary learning (ODL), and propose a post-processing method to label and adapt a learned dictionary for pitch estimation. Results show that MOD and properly initialized ODL (pi-ODL) can lead to dictionaries that exhibit the desired harmonic structures for pitch estimation, and the post-processing method can significantly improve the performance of the learned dictionaries in pitch estimation. The dictionary obtained with pi-ODL and post-processing attained pitch estimation accuracy close to the optimal performance of the manual dictionary. These results show that dictionary learning is feasible and promising for this application.
DOI: https://doi.org/10.1109/MLSP.2017.8168173. Published 2017-09-01, pp. 1-6.
Citations: 1
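The MOD step named in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a sparsity level of one (each signal is coded by a single atom), which makes the MOD update D = Y X^T (X X^T)^-1 reduce to a weighted average per atom. The function names (`sparse_code_1`, `mod_update`, `learn_dictionary`) are hypothetical, and no harmonic structure, labeling, or post-processing is modeled.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm (returned unchanged if all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def sparse_code_1(signal, dictionary):
    """1-sparse coding (an assumption made for brevity).

    Picks the unit-norm atom with the largest |inner product| and
    returns (atom index, coefficient).
    """
    best_j, best_c = 0, 0.0
    for j, atom in enumerate(dictionary):
        c = sum(a * s for a, s in zip(atom, signal))
        if abs(c) > abs(best_c):
            best_j, best_c = j, c
    return best_j, best_c


def mod_update(signals, codes, dictionary):
    """MOD dictionary update D = Y X^T (X X^T)^-1.

    With 1-sparse codes, X X^T is diagonal, so each atom becomes the
    coefficient-weighted average of the signals assigned to it.
    """
    dim = len(signals[0])
    new_dict = []
    for j, old_atom in enumerate(dictionary):
        num, den = [0.0] * dim, 0.0
        for y, (k, c) in zip(signals, codes):
            if k == j:
                num = [n + c * yi for n, yi in zip(num, y)]
                den += c * c
        # Keep the old atom if no signal selected it.
        new_dict.append(normalize([n / den for n in num]) if den > 0 else old_atom)
    return new_dict


def residual(signals, codes, dictionary):
    """Total squared reconstruction error ||Y - D X||_F^2."""
    return sum(
        sum((yi - c * di) ** 2 for yi, di in zip(y, dictionary[k]))
        for y, (k, c) in zip(signals, codes)
    )


def learn_dictionary(signals, n_atoms, n_iter=10, seed=0):
    """Alternate sparse coding and MOD updates; return atoms and error trace."""
    rng = random.Random(seed)
    dictionary = [normalize(list(s)) for s in rng.sample(signals, n_atoms)]
    errors = []
    for _ in range(n_iter):
        codes = [sparse_code_1(y, dictionary) for y in signals]
        errors.append(residual(signals, codes, dictionary))  # error after coding
        dictionary = mod_update(signals, codes, dictionary)
    return dictionary, errors
```

The error is recorded right after each coding step because that is where the alternating scheme is guaranteed not to increase it; a realistic setup would use a higher sparsity level and a proper sparse solver.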
Difference-of-convex optimization for variational KL-corrected inference in Dirichlet process mixtures
Rasmus Bonnevie, Mikkel N. Schmidt, Morten Mørup
Variational methods for approximate inference in Bayesian models optimise a lower bound on the marginal likelihood, but the optimization problem often suffers from being nonconvex and high-dimensional. This can be alleviated by working in a collapsed domain where a part of the parameter space is marginalized. We consider the KL-corrected collapsed variational bound and apply it to Dirichlet process mixture models, allowing us to reduce the optimization space considerably. We find that the variational bound exhibits consistent and exploitable structure, allowing the application of difference-of-convex optimization algorithms. We show how this yields an interpretable fixed-point update algorithm in the collapsed setting for the Dirichlet process mixture model. We connect this update formula to classical coordinate ascent updates, illustrating that the proposed improvement surprisingly reduces to the traditional scheme.
DOI: https://doi.org/10.1109/MLSP.2017.8168159. Published 2017-09-01, pp. 1-6.
Citations: 0
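To make the idea of difference-of-convex optimization concrete, here is a toy sketch on a one-dimensional function. It is unrelated to the KL-corrected bound of the paper: the objective f(x) = x^4 - 2x^2 and its split into g(x) = x^4 and h(x) = 2x^2 are assumptions chosen because the resulting step has a closed form, which mirrors the fixed-point flavor mentioned in the abstract.

```python
import math


def f(x):
    """Nonconvex toy objective, written as g(x) - h(x) with g = x**4, h = 2*x**2."""
    return x ** 4 - 2 * x ** 2


def dca_quartic(x0, n_iter=60):
    """Difference-of-convex algorithm (DCA) for f(x) = x**4 - 2*x**2.

    Each step linearizes the convex function h at the current point x_k and
    minimizes the convex surrogate g(x) - h'(x_k) * x. Setting the derivative
    to zero gives 4*x**3 = 4*x_k, so the update is a real cube root.
    """
    x = x0
    history = [x]
    for _ in range(n_iter):
        x = math.copysign(abs(x) ** (1.0 / 3.0), x)
        history.append(x)
    return x, history
```

Starting from any positive point, the iterates move monotonically toward the local minimizer x = 1, and from any negative point toward x = -1; the objective value never increases along the way.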
Infinite probabilistic latent component analysis for audio source separation
Kazuyoshi Yoshii, Eita Nakamura, Katsutoshi Itoyama, Masataka Goto
This paper presents a statistical method of audio source separation based on a nonparametric Bayesian extension of probabilistic latent component analysis (PLCA). A major approach to audio source separation is to use nonnegative matrix factorization (NMF), which approximates the magnitude spectrum of a mixture signal at each frame as the weighted sum of fewer source spectra. Another approach is to use PLCA, which regards the magnitude spectrogram as a two-dimensional histogram of “sound quanta” and classifies each quantum into one of the sources. While NMF has a physically natural interpretation, PLCA has been used successfully for music signal analysis. To enable PLCA to estimate the number of sources, we propose Dirichlet process PLCA (DP-PLCA) and derive two kinds of learning methods based on variational Bayes and collapsed Gibbs sampling. Unlike existing learning methods for nonparametric Bayesian NMF based on the beta or gamma processes (BP-NMF and GaP-NMF), our sampling method can efficiently search for the optimal number of sources without truncating the number of sources to be considered. Experimental results showed that DP-PLCA is superior to GaP-NMF in terms of source number estimation.
DOI: https://doi.org/10.1109/MLSP.2017.8168189. Published 2017-09-01, pp. 1-6.
Citations: 1
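The PLCA view described above, in which a spectrogram is a two-dimensional histogram whose quanta are assigned to sources, can be sketched with plain EM. This is the ordinary finite PLCA with a fixed number of components, not the Dirichlet process extension proposed in the paper; the function name `plca` and the toy matrix in the usage note are assumptions.

```python
import math
import random


def normalize(p):
    """Rescale nonnegative weights to sum to one (uniform if all are zero)."""
    s = sum(p)
    return [x / s for x in p] if s > 0 else [1.0 / len(p)] * len(p)


def plca(V, n_components, n_iter=50, seed=0):
    """Fit P(f, t) = sum_z P(z) P(f|z) P(t|z) to a nonnegative matrix V by EM.

    V[f][t] is treated as the number of quanta in frequency bin f, frame t.
    Returns P(z), P(f|z), P(t|z) and the log-likelihood trace.
    """
    rng = random.Random(seed)
    F, T, K = len(V), len(V[0]), n_components
    pz = [1.0 / K] * K
    pf = [normalize([rng.random() + 0.1 for _ in range(F)]) for _ in range(K)]
    pt = [normalize([rng.random() + 0.1 for _ in range(T)]) for _ in range(K)]
    loglik = []
    for _ in range(n_iter):
        new_pz = [0.0] * K
        new_pf = [[0.0] * F for _ in range(K)]
        new_pt = [[0.0] * T for _ in range(K)]
        ll = 0.0
        for f in range(F):
            for t in range(T):
                if V[f][t] == 0:
                    continue
                joint = [pz[z] * pf[z][f] * pt[z][t] for z in range(K)]
                total = sum(joint)
                ll += V[f][t] * math.log(total)
                for z in range(K):
                    # E-step: expected number of quanta in (f, t) from source z.
                    r = V[f][t] * joint[z] / total
                    new_pz[z] += r
                    new_pf[z][f] += r
                    new_pt[z][t] += r
        # M-step: renormalize the expected counts into distributions.
        pz = normalize(new_pz)
        pf = [normalize(row) for row in new_pf]
        pt = [normalize(row) for row in new_pt]
        loglik.append(ll)
    return pz, pf, pt, loglik
```

For example, `plca([[5, 5, 0, 0], [5, 5, 0, 0], [0, 0, 3, 3], [0, 0, 3, 3]], 2)` fits two components to a block-structured toy histogram. Choosing the number of components automatically is exactly what this finite version cannot do.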
Improving image classification with frequency domain layers for feature extraction
J. Stuchi, M. A. Angeloni, R. F. Pereira, L. Boccato, G. Folego, Paulo V. S. Prado, R. Attux
Machine learning is increasingly widely used. Great improvements, especially in deep neural networks, have helped to boost the achievable performance in computer vision and signal processing applications. Although different techniques have been applied to deep architectures, the frequency domain has not been thoroughly explored in this field. In this context, this paper presents a new method for extracting discriminative features based on Fourier analysis. The proposed frequency extractor layer can be combined with deep architectures in order to improve image classification. Computational experiments were performed on the face liveness detection problem, yielding better results than those presented in the literature for the grandtest protocol of the Replay-Attack Database. This paper also aims to raise the discussion on how frequency domain layers can be used in deep architectures to further improve network performance.
DOI: https://doi.org/10.1109/MLSP.2017.8168168. Published 2017-09-01, pp. 1-6.
Citations: 22
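The abstract does not define the frequency extractor layer, so the following is only a generic illustration of Fourier-based image features: the magnitude of a 2-D DFT summed over concentric frequency rings. The function names and the ring layout are assumptions, and the naive transform is meant for tiny inputs only.

```python
import cmath
import math


def dft2_magnitude(image):
    """Magnitude of the 2-D discrete Fourier transform (naive O(N^4) version)."""
    H, W = len(image), len(image[0])
    mag = [[0.0] * W for _ in range(H)]
    for u in range(H):
        for v in range(W):
            acc = 0j
            for y in range(H):
                for x in range(W):
                    angle = -2.0 * math.pi * (u * y / H + v * x / W)
                    acc += image[y][x] * cmath.exp(1j * angle)
            mag[u][v] = abs(acc)
    return mag


def ring_features(image, n_rings=3):
    """Sum spectral magnitude inside concentric rings around the DC term.

    Frequency indices are folded so that the distance is measured from DC;
    ring 0 holds the lowest frequencies, the last ring the highest.
    """
    H, W = len(image), len(image[0])
    mag = dft2_magnitude(image)
    max_r = math.hypot(H // 2, W // 2)
    feats = [0.0] * n_rings
    for u in range(H):
        for v in range(W):
            fu, fv = min(u, H - u), min(v, W - v)
            r = math.hypot(fu, fv)
            idx = min(int(n_rings * r / (max_r + 1e-12)), n_rings - 1)
            feats[idx] += mag[u][v]
    return feats
```

A constant image puts all of its magnitude in the first ring, while a pixel-level checkerboard puts it in the last one, which is the kind of separation a frequency-based feature can offer a classifier.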
Text to image generative model using constrained embedding space mapping
Subhajit Chaudhury, Sakyasingha Dasgupta, Asim Munawar, Md. A. Salam Khan, Ryuki Tachibana
We present a conditional generative method that maps low-dimensional embeddings of image and natural language to a common latent space, thereby extracting semantic relationships between them. The embedding specific to a modality is first extracted, and subsequently a constrained optimization procedure is performed to project the two embedding spaces to a common manifold. Based on this, we present a method to learn the conditional probability distribution of the two embedding spaces, first by mapping them to a shared latent space and then by generating the individual embeddings back from this common space. However, in order to enable independent conditional inference for separately extracting the corresponding embeddings from the common latent space representation, we deploy a proxy variable trick, wherein the single shared latent space is replaced by two separate latent spaces. We design an objective function such that, during training, we can force these separate spaces to lie close to each other by minimizing the Euclidean distance between their distribution functions. Experimental results demonstrate that the learned joint model can generalize to learning concepts of double MNIST digits with additional attributes of colors, thereby enabling the generation of specific colored images from the respective text data.
DOI: https://doi.org/10.1109/MLSP.2017.8168111. Published 2017-09-01, pp. 1-6.
Citations: 5
MIML-AI: Mixed-supervision multi-instance multi-label learning with auxiliary information
Tarn Nguyen, R. Raich, Xiaoli Z. Fern, Anh T. Pham
Manual labeling of individual instances is time-consuming. This is commonly resolved by labeling a bag of instances with a single common label or label set. However, this approach is still costly for large datasets. In this paper, we propose a mixed-supervision multi-instance multi-label learning model for learning from easily available metadata (MIML-AI). This auxiliary information is normally collected automatically with the data, e.g., an image's location information or a document's author name. We propose a discriminative graphical model with exact inference to train a classifier based on auxiliary label information and a small number of labeled bags. This strategy utilizes metadata as a means of providing a weaker label as an alternative to intensive manual labeling. Experiments on real data illustrate the effectiveness of our proposed method relative to current approaches, which do not use the information from bags that contain only metadata label information.
DOI: https://doi.org/10.1109/MLSP.2017.8168107. Published 2017-09-01, pp. 1-6.
Citations: 0
Discriminating schizophrenia from normal controls using resting state functional network connectivity: A deep neural network and layer-wise relevance propagation method
Weizheng Yan, S. Plis, V. Calhoun, Shengfeng Liu, R. Jiang, T. Jiang, J. Sui
Deep learning has gained considerable attention in the scientific community, breaking benchmark records in many fields such as speech and visual recognition [1]. Motivated by extending the advances of deep learning approaches to brain imaging classification, we propose a framework, called “deep neural network (DNN) + layer-wise relevance propagation (LRP)”, to distinguish schizophrenia patients (SZ) from healthy controls (HCs) using functional network connectivity (FNC). A total of 1100 Chinese subjects from 7 sites are included, each with a 50×50 FNC matrix resulting from group ICA on resting-state fMRI data. The proposed DNN+LRP not only improves classification accuracy significantly compared to four state-of-the-art classification methods (84% vs. less than 79%, 10-fold cross-validation) but also enables identification of the FNC patterns that contribute most to SZ classification, which cannot be easily traced back by general DNN models. By conducting LRP, we identified the FNC patterns that exhibit the highest discriminative power in SZ classification. More importantly, when using leave-one-site-out cross-validation (using 6 sites for training, 1 site for testing, 7 times in total), the cross-site classification accuracy reached 82%, suggesting high robustness and generalization performance of the proposed method, promising wide utility in the community and great potential for biomarker identification of brain disorders.
DOI: https://doi.org/10.1109/MLSP.2017.8168179. Published 2017-09-01, pp. 1-6.
Citations: 40
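The leave-one-site-out protocol mentioned in the abstract above (train on 6 sites, test on the remaining one, repeated 7 times) amounts to a grouped split. A minimal sketch follows, assuming each subject carries a site label; the function name and the labels in the usage note are hypothetical.

```python
def leave_one_site_out(site_labels):
    """Yield (held_out_site, train_indices, test_indices), one fold per site.

    site_labels[i] is the acquisition site of subject i. Every subject from
    the held-out site goes to the test set; all other subjects are used for
    training, so no site contributes to both sets within a fold.
    """
    for held_out in sorted(set(site_labels)):
        train = [i for i, s in enumerate(site_labels) if s != held_out]
        test = [i for i, s in enumerate(site_labels) if s == held_out]
        yield held_out, train, test
```

With labels such as `["A", "A", "B", "C", "C"]` this yields three folds; with 7 distinct sites it yields the 7 folds described above. Unlike random 10-fold splitting, the test subjects always come from a site the model has never seen.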
A locally optimal algorithm for estimating a generating partition from an observed time series
David J. Miller, Najah F. Ghalyan, A. Ray
Estimation of a generating partition is critical for symbolization of measurements from discrete-time dynamical systems, where a sequence of symbols from a (finite-cardinality) alphabet uniquely specifies the underlying time series. Such symbolization is useful for computing measures (e.g., Kolmogorov-Sinai entropy) to characterize the (possibly unknown) dynamical system. It is also useful for time series classification and anomaly detection. Previous work attempted to minimize a clustering objective function that measures discrepancy between a set of reconstruction values and the points from the time series. Unfortunately, the resulting algorithm is non-convergent, with no guarantee of finding even locally optimal solutions. The problem lies in a heuristic “nearest neighbor” symbol assignment step. Instead, we introduce a new, locally optimal algorithm. We apply iterative “nearest neighbor” symbol assignments with guaranteed discrepancy descent, by which joint, locally optimal symbolization of the time series is achieved. While some approaches use vector quantization to partition the state space, our approach only ensures a partition in the space consisting of the entire time series (effectively, clustering in an infinite-dimensional space). Our approach also amounts to a novel type of sliding block lossy source coding. We demonstrate improvement, with respect to several measures, over a popular method used in the literature.
DOI: https://doi.org/10.1109/MLSP.2017.8168162. Published 2017-09-01, pp. 1-6.
Citations: 1
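To show what a generating partition is used for, the sketch below symbolizes a time series and estimates an entropy rate from the symbols. It does not implement the estimation algorithm of the paper. It assumes the logistic map x -> 4x(1 - x), for which the two-cell partition at 0.5 is known to be generating and the Kolmogorov-Sinai entropy equals ln 2; all function names are hypothetical.

```python
import math
from collections import Counter


def logistic_series(x0, n, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) for n samples."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def symbolize(series, thresholds):
    """Map each sample to the index of the partition cell it falls in."""
    return [sum(x >= t for t in thresholds) for x in series]


def block_entropy(symbols, n):
    """Shannon entropy (in nats) of the empirical distribution of length-n words."""
    words = Counter(tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))
    total = sum(words.values())
    return -sum(c / total * math.log(c / total) for c in words.values())


def entropy_rate(symbols, n):
    """Entropy-rate estimate H(n) - H(n - 1) from block entropies."""
    return block_entropy(symbols, n) - block_entropy(symbols, n - 1)
```

Calling `entropy_rate(symbolize(logistic_series(0.3, 20000), [0.5]), 4)` should give a value near ln 2 (about 0.693). A poorly placed threshold would generally give a lower estimate, which is why estimating the partition itself matters when the dynamics are unknown.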
A Bayesian forecasting and anomaly detection framework for vehicular monitoring networks
Maria Scalabrin, Matteo Gadaleta, Riccardo Bonetto, M. Rossi
In this paper, we are concerned with the automated and runtime analysis of vehicular data from large-scale traffic monitoring networks. This problem is tackled through localized and small-size Bayesian networks (BNs), which are utilized to capture the spatio-temporal relationships underpinning traffic data from nearby road links. A dedicated BN is set up, trained, and tested for each road in the monitored geographical map. The joint probability distribution between the cause nodes and the effect node in the BN is tracked through a Gaussian Mixture Model (GMM), whose parameters are estimated via Bayesian Variational Inference (BVI). Forecasting and anomaly detection are performed on statistical measures derived at runtime by the trained GMMs. Our design choices lead to several advantages: the approach is scalable, since a small-size BN is associated with and independently trained for each road, and the localized nature of the framework allows flagging atypical behaviors at their point of origin in the monitored geographical map. The effectiveness of the proposed framework is tested using a large dataset from a real network deployment, comparing its prediction performance with that of selected regression algorithms from the literature, while also quantifying its anomaly detection capabilities.
DOI: https://doi.org/10.1109/MLSP.2017.8168151 | Published: 2017-09-01 | Pages: 1-6
Citations: 12
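The abstract of the vehicular monitoring paper describes forecasting and anomaly detection from a GMM over cause and effect variables. As a rough illustration only, the sketch below conditions an already fitted two-dimensional GMM on one cause value to forecast the effect, then flags an observation that falls far from that forecast. The component layout, the function names, and the z-score threshold rule are assumptions made for this example; they are not taken from the paper, and the variational fitting step is not shown.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def conditional_forecast(components, x):
    """Mean and variance of the effect y given the cause x under a 2-D GMM.

    components: list of (weight, mean_x, mean_y, var_x, cov_xy, var_y)
    tuples (a hypothetical layout chosen for this sketch).
    """
    # Responsibility of each component for the observed cause value x.
    scores = [w * normal_pdf(x, mx, vx) for (w, mx, my, vx, cxy, vy) in components]
    total = sum(scores)
    resp = [s / total for s in scores]

    # Per-component Gaussian conditioning of y on x.
    means = [my + cxy / vx * (x - mx) for (w, mx, my, vx, cxy, vy) in components]
    variances = [vy - cxy ** 2 / vx for (w, mx, my, vx, cxy, vy) in components]

    # Moments of the resulting one-dimensional mixture.
    forecast = sum(r * m for r, m in zip(resp, means))
    second_moment = sum(r * (v + m * m) for r, m, v in zip(resp, means, variances))
    return forecast, second_moment - forecast ** 2


def is_anomalous(components, x, y_observed, z=3.0):
    """Assumed rule: flag y if it lies more than z standard deviations from the forecast."""
    forecast, variance = conditional_forecast(components, x)
    return abs(y_observed - forecast) > z * math.sqrt(variance)
```

Because each road has its own small model in the framework described above, a check of this kind would run independently per road, which is what allows an anomaly to be attributed to a specific location.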
Semi-Blind speech enhancement based on recurrent neural network for source separation and dereverberation
Masaya Wake, Yoshiaki Bando, M. Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara
This paper describes a semi-blind speech enhancement method that uses a semi-blind recurrent neural network (SB-RNN) for human-robot speech interaction. When a robot interacts with a human through speech, it has access not only to the audio signals recorded by its own microphone but also to the speech signals it produces itself, and the latter can be used for semi-blind speech enhancement. The SB-RNN consists of two cascaded modules: a semi-blind source separation module and a blind dereverberation module. Each module has a recurrent layer to capture the temporal correlations of speech signals. The SB-RNN is trained by multi-task learning, i.e., isolated echoic speech signals are used as teacher signals for the output of the separation module, and isolated anechoic signals are used as teacher signals for the output of the dereverberation module. Experimental results showed that the source-to-distortion ratio improved by 2.30 dB on average compared to a conventional method based on semi-blind independent component analysis. The results also showed the effectiveness of the modular network design, multi-task learning, the recurrent structure, and semi-blind source separation.
DOI: https://doi.org/10.1109/MLSP.2017.8168191 | Published: 2017-09-01 | Pages: 1-6
Citations: 2
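To put the 2.30 dB source-to-distortion ratio gain reported in the SB-RNN abstract into perspective, the sketch below computes a simplified SDR and converts a dB gain into a power ratio. It assumes the plain definition of signal power over residual power; the paper may use a different evaluation procedure, and the function names are hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Simplified source-to-distortion ratio in dB (assumed definition).

    Treats everything in (reference - estimate) as distortion. The estimate
    must differ from the reference, otherwise the distortion power is zero.
    """
    signal_power = sum(r * r for r in reference)
    distortion_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / distortion_power)


def db_to_power_ratio(gain_db):
    """Convert a gain in dB into the corresponding power ratio."""
    return 10.0 ** (gain_db / 10.0)


# A 2.30 dB gain corresponds to roughly 1.7 times less distortion power
# at equal signal power.
print(round(db_to_power_ratio(2.30), 3))
```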