2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP): Latest Papers

A diagonal plus low-rank covariance model for computationally efficient source separation
A. Liutkus, Kazuyoshi Yoshii
This paper presents an accelerated version of positive semidefinite tensor factorization (PSDTF) for blind source separation. PSDTF works better than nonnegative matrix factorization (NMF) by dropping the arguable assumption that audio signals can be whitened in the frequency domain by using the short-time Fourier transform (STFT). Indeed, this assumption only holds true in an ideal situation where each frame is infinitely long and the target signal is completely stationary in each frame. PSDTF thus deals with full covariance matrices over frequency bins instead of forcing them to be diagonal as in NMF. Although PSDTF significantly outperforms NMF in terms of separation performance, it suffers from a heavy computational cost due to the repeated inversion of big covariance matrices. To solve this problem, we propose an intermediate model based on diagonal plus low-rank covariance matrices and derive the expectation-maximization (EM) algorithm for efficiently updating the parameters of PSDTF. Experimental results showed that our method can reduce the complexity of PSDTF by several orders of magnitude without a significant decrease in separation performance.
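The computational argument in the abstract above rests on a standard linear-algebra fact: a system with a diagonal plus low-rank matrix can be solved without forming or factorizing the full matrix. The minimal sketch below illustrates this for the rank-1 case with the Sherman-Morrison formula (the rank-K case uses the Woodbury identity in the same way). It is not the paper's EM algorithm; the function names and the toy numbers are illustrative assumptions.

```python
def solve_diag_plus_rank1(d, u, x):
    """Solve (diag(d) + u u^T) y = x in O(F) time via Sherman-Morrison.

    d: positive diagonal entries, u: rank-1 factor, x: right-hand side.
    """
    dinv_x = [xi / di for xi, di in zip(x, d)]  # D^{-1} x
    dinv_u = [ui / di for ui, di in zip(u, d)]  # D^{-1} u
    ut_dinv_x = sum(ui * v for ui, v in zip(u, dinv_x))
    ut_dinv_u = sum(ui * v for ui, v in zip(u, dinv_u))
    scale = ut_dinv_x / (1.0 + ut_dinv_u)
    return [a - scale * b for a, b in zip(dinv_x, dinv_u)]


def apply_diag_plus_rank1(d, u, y):
    """Multiply (diag(d) + u u^T) by y; used here only to check the solve."""
    ut_y = sum(ui * yi for ui, yi in zip(u, y))
    return [di * yi + ui * ut_y for di, ui, yi in zip(d, u, y)]


# Toy example (hypothetical values): four frequency bins.
d = [2.0, 1.0, 4.0, 0.5]
u = [1.0, -1.0, 0.5, 2.0]
x = [1.0, 2.0, 3.0, 4.0]
y = solve_diag_plus_rank1(d, u, x)
residual = max(abs(a - b) for a, b in zip(apply_diag_plus_rank1(d, u, y), x))
```

The solve touches each of the F entries a constant number of times, whereas a general dense inverse costs on the order of F cubed operations; this gap is the kind of saving the abstract refers to.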
DOI: https://doi.org/10.1109/MLSP.2017.8168169 | Published: 2017-09-01 | Pages: 1-6
Citations: 8
On generating mixing noise signals with basis functions for simulating noisy speech and learning DNN-based speech enhancement models
Shi-Xue Wen, Jun Du, Chin-Hui Lee
We first examine the generalization issue with the noise samples used in training nonlinear mapping functions between noisy and clean speech features for deep neural network (DNN) based speech enhancement. Then an empirical proof is established to explain why the DNN-based approach has a good noise generalization capability provided that a large collection of noise types is included in generating diverse noisy speech samples for training. It is shown that an arbitrary noise signal segment can be well represented by a linear combination of microstructure noise bases. Accordingly, we propose to generate these mixing noise signals by designing a set of compact and analytic noise bases without using any realistic noise types. The experiments demonstrate that this noise generation scheme can yield performance comparable to that obtained using 50 real noise types. Furthermore, by supplementing the collected noise types with the synthesized noise bases, we observe remarkable performance improvements, implying that not only can the need for a large collection of real-world noise signals be alleviated, but a good noise generalization capability can also be achieved.
DOI: https://doi.org/10.1109/MLSP.2017.8168192 | Published: 2017-09-01 | Pages: 1-6
Citations: 3
Partitioning in signal processing using the object migration automaton and the pursuit paradigm
Abdolreza Shirvani, B. Oommen
Data in all Signal Processing (SP) applications is being generated super-exponentially, and at an ever-increasing rate. A meaningful way to pre-process it so as to achieve feasible computation is to partition the data [5]. Indeed, the task of partitioning is one of the most difficult problems in computing, and it has extensive applications in solving real-life problems, especially when the amount of SP data (e.g., images, voices, speakers, libraries, etc.) to be processed is prohibitively large. The problem is known to be NP-hard. The benchmark solution for the Equi-partitioning Problem (EPP) comes from the classic field of Learning Automata (LA), and the corresponding algorithm, the Object Migration Automaton (OMA), has been used in numerous application domains. While the OMA is a fixed structure machine, it does not incorporate the Pursuit concept that has recently significantly enhanced the field of LA. In this paper, we pioneer the incorporation of the Pursuit concept into the OMA. We do this by a non-intuitive paradigm, namely that of removing (or discarding) from the query stream those queries that could be counter-productive. This can be perceived as a filtering agent triggered by a pursuit-based module. The resulting machine, referred to as the Pursuit OMA (POMA), has been rigorously tested in all the standard benchmark environments. Indeed, in certain extreme environments it is almost ten times faster than the original OMA. The application of the POMA to all signal processing applications is extremely promising.
DOI: https://doi.org/10.1109/MLSP.2017.8168149 | Published: 2017-09-01 | Pages: 1-6
Citations: 2
Mel-Generalized cepstral regularization for discriminative non-negative matrix factorization
Li Li, H. Kameoka, S. Makino
The non-negative matrix factorization (NMF) approach has been shown to work reasonably well for monaural speech enhancement tasks. This paper proposes addressing two shortcomings of the original NMF approach: (1) the objective functions for the basis training and separation (Wiener filtering) are inconsistent (the basis spectra are not trained so that the separated signal becomes optimal); (2) minimizing spectral divergence measures does not necessarily lead to an enhancement in the feature domain (e.g., cepstral domain) or in terms of perceived quality. To address the first shortcoming, we have previously proposed an algorithm for Discriminative NMF (DNMF), which optimizes the same objective for basis training and separation. To address the second shortcoming, we have previously introduced novel frameworks called the cepstral distance regularized NMF (CDRNMF) and mel-generalized cepstral distance regularized NMF (MGCRNMF), which aim to enhance speech both in the spectral domain and the feature domain. This paper proposes combining the goals of DNMF and MGCRNMF by incorporating the MGC regularizer into the DNMF objective function and proposes an algorithm for parameter estimation. The experimental results revealed that the proposed method outperformed the baseline approaches.
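The separation step named in shortcoming (1) above, Wiener filtering with NMF models, can be sketched in a few lines. The example below is a minimal illustration of the generic NMF-based Wiener filter only, assuming pre-trained speech and noise factors; it does not implement DNMF training or the MGC regularizer, and all names and numbers are hypothetical.

```python
def matmul(a, b):
    """Plain nested-list matrix product (F x K) @ (K x T)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def wiener_separate(v, w_speech, h_speech, w_noise, h_noise, eps=1e-12):
    """Split spectrogram v into speech and noise using a soft mask.

    mask = (W_s H_s) / (W_s H_s + W_n H_n), applied element-wise to v.
    """
    speech_model = matmul(w_speech, h_speech)
    noise_model = matmul(w_noise, h_noise)
    speech, noise = [], []
    for f in range(len(v)):
        s_row, n_row = [], []
        for t in range(len(v[0])):
            total = speech_model[f][t] + noise_model[f][t] + eps
            mask = speech_model[f][t] / total
            s_row.append(mask * v[f][t])
            n_row.append((1.0 - mask) * v[f][t])
        speech.append(s_row)
        noise.append(n_row)
    return speech, noise


# Toy factors (hypothetical): 2 frequency bins, 2 frames, 1 basis per source.
w_speech, h_speech = [[1.0], [3.0]], [[2.0, 0.0]]
w_noise, h_noise = [[1.0], [1.0]], [[2.0, 1.0]]
v = [[4.0, 1.0], [8.0, 1.0]]
speech, noise = wiener_separate(v, w_speech, h_speech, w_noise, h_noise)
```

Because the mask is built from the model product rather than from the bases alone, bases trained to minimize a reconstruction divergence are not necessarily the ones that make the masked output best, which is the inconsistency the abstract describes.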
DOI: https://doi.org/10.1109/MLSP.2017.8168142 | Published: 2017-09-01 | Pages: 1-6
Citations: 0
Navigation-Based learning for survey trajectory classification in autonomous underwater vehicles
M. D. L. Alvarez, H. Hastie, D. Lane
Time-series sensor data processing is indispensable for system monitoring. Working with autonomous vehicles requires mechanisms that provide insightful information about the status of a mission. In a setting where time and resources are limited, trajectory classification plays a vital role in mission monitoring and failure detection. In this context, we use navigational data to interpret trajectory patterns and classify them. We implement Long Short-Term Memory (LSTM) based Recurrent Neural Networks (RNN) that learn the most commonly used survey trajectory patterns from surveys executed by two types of Autonomous Underwater Vehicles (AUV). We compare the performance of our network against baseline machine learning methods.
DOI: https://doi.org/10.1109/MLSP.2017.8168137 | Published: 2017-09-01 | Pages: 1-6
Citations: 4
Fast algorithm using summed area tables with unified layer performing convolution and average pooling
Akihiko Kasagi, T. Tabaru, H. Tamura
Convolutional neural networks (CNNs), in which several convolutional layers extract feature patterns from an input image, are one of the most popular network architectures used for image classification. The convolutional computation, however, requires a high computational cost, resulting in an increased power consumption and processing time. In this paper, we propose a novel algorithm that substitutes a single layer for a pair formed by a convolutional layer and the following average-pooling layer. The key idea of the proposed scheme is to compute the output of the pair of original layers without the computation of convolution. To achieve this end, our algorithm generates summed area tables (SATs) of input images first and directly computes the output values from the SATs. We implemented our algorithm for forward propagation and backward propagation to evaluate the performance. Our experimental results showed that our algorithm achieved 17.1 times faster performance than the original algorithm for the same parameter used in ResNet-34.
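One plausible reading of the key idea above is that the average over a pooling window can be moved inside the convolution sum, so that each kernel weight multiplies a box sum of the input, and each box sum is read from a summed area table in constant time. The sketch below checks this on a single-channel, stride-1 toy case. It is an illustration under those assumptions, not the authors' implementation; the function names and test data are hypothetical.

```python
def summed_area_table(x):
    """sat[r][c] holds the sum of x over rows < r and columns < c."""
    h, w = len(x), len(x[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0.0
        for c in range(w):
            row_sum += x[r][c]
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat


def box_sum(sat, top, left, height, width):
    """Sum of the input over a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def conv_then_avgpool(x, kernel, pool):
    """Reference: valid stride-1 convolution, then non-overlapping average pooling."""
    k = len(kernel)
    out_h, out_w = len(x) - k + 1, len(x[0]) - k + 1
    conv = [[sum(kernel[u][v] * x[r + u][c + v] for u in range(k) for v in range(k))
             for c in range(out_w)] for r in range(out_h)]
    return [[sum(conv[i * pool + a][j * pool + b]
                 for a in range(pool) for b in range(pool)) / pool ** 2
             for j in range(out_w // pool)] for i in range(out_h // pool)]


def fused_sat_layer(x, kernel, pool):
    """Same output, computed from box sums without forming the convolution."""
    k = len(kernel)
    sat = summed_area_table(x)
    out_h, out_w = (len(x) - k + 1) // pool, (len(x[0]) - k + 1) // pool
    return [[sum(kernel[u][v] * box_sum(sat, i * pool + u, j * pool + v, pool, pool)
                 for u in range(k) for v in range(k)) / pool ** 2
             for j in range(out_w)] for i in range(out_h)]


# Toy data (hypothetical): 6x6 input, 3x3 kernel, 2x2 pooling.
x = [[float((3 * r + 5 * c) % 7) for c in range(6)] for r in range(6)]
kernel = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]
reference = conv_then_avgpool(x, kernel, 2)
fused = fused_sat_layer(x, kernel, 2)
```

In this form each pooled output needs one multiplication per kernel weight, rather than one per kernel weight per pooled position, which suggests where a speed-up could come from.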
DOI: https://doi.org/10.1109/MLSP.2017.8168154 | Published: 2017-09-01 | Pages: 1-6
Citations: 19
Automatic plant identification using stem automata
Kan Li, Ying Ma, J. Príncipe
In this paper, we propose a novel approach to automatically identify plant species using the dynamics of plant growth and development, or spatiotemporal evolution model (STEM). The online kernel adaptive autoregressive-moving-average (KAARMA) algorithm, a discrete-time dynamical system in the reproducing kernel Hilbert space (RKHS), is used to learn plant-development syntactic patterns from feature-vector sequences automatically extracted from 2D plant images generated by stochastic L-systems. Results show that multiclass KAARMA STEM can automatically identify plant species based on growth patterns. Furthermore, finite state machines extracted from trained KAARMA STEM retain competitive performance and are robust to noise. Automatically constructing an L-system or formal grammar to replicate a spatiotemporal structure is an open problem. This is an important first step to not only identify plants but also to generate realistic plant models automatically from observations.
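For readers unfamiliar with the stochastic L-systems mentioned above, the following minimal sketch shows the string-rewriting mechanism: every symbol is replaced in parallel at each step, and a symbol with several productions picks one at random according to its weight. The rule sets here are textbook-style toy examples and assumptions of this sketch, not the grammars used in the paper.

```python
import random


def rewrite(axiom, rules, steps, seed=0):
    """Apply weighted production rules to every symbol, in parallel, `steps` times.

    rules maps a symbol to a list of (replacement, weight) pairs;
    symbols without a rule are copied unchanged.
    """
    rng = random.Random(seed)
    state = axiom
    for _ in range(steps):
        out = []
        for symbol in state:
            options = rules.get(symbol)
            if options is None:
                out.append(symbol)
            else:
                replacements = [rep for rep, _ in options]
                weights = [wt for _, wt in options]
                out.append(rng.choices(replacements, weights=weights, k=1)[0])
        state = "".join(out)
    return state


# Deterministic special case: one production per symbol.
algae_rules = {"A": [("AB", 1.0)], "B": [("A", 1.0)]}
deterministic = rewrite("A", algae_rules, 4)

# Stochastic branching rule: each F grows a left or right branch with equal weight.
plant_rules = {"F": [("F[+F]F", 0.5), ("F[-F]F", 0.5)]}
sample = rewrite("F", plant_rules, 3, seed=1)
```

Strings such as `sample` are typically interpreted as drawing commands to render a plant shape; different seeds give different individuals of the same "species", which is what makes the growth sequences a classification problem.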
DOI: https://doi.org/10.1109/MLSP.2017.8168147 | Published: 2017-09-01 | Pages: 1-6
Citations: 3
A comparative study of example-guided audio source separation approaches based on nonnegative matrix factorization
A. Ozerov, Srdan Kitic, P. Pérez
We consider example-guided audio source separation approaches, where the audio mixture to be separated is supplied with source examples that are assumed to match the sources in the mixture in both frequency and time. These approaches have been successfully applied to tasks such as source separation by humming, score-informed music source separation, and music source separation guided by covers. Most of the proposed methods are based on nonnegative matrix factorization (NMF) and its variants, including methods using NMF models pre-trained from examples as an initialization of the mixture NMF decomposition, methods using those models as hyperparameters of priors of the mixture NMF decomposition, and methods using coupled NMF models. Moreover, those methods differ in the choice of the NMF divergence and the NMF prior. However, there is no systematic comparison of all these methods. In this work, we compare existing methods and some new variants on the score-informed and cover-guided source separation tasks.
DOI: https://doi.org/10.1109/MLSP.2017.8168196 | Published: 2017-09-01 | Pages: 1-6
Citations: 1
Correntropy induced metric based common spatial patterns
J. Dong, Badong Chen, N. Lu, Haixian Wang, Nanning Zheng
Common spatial patterns (CSP) is a widely used method in the field of electroencephalogram (EEG) signal processing. The goal of CSP is to find spatial filters that maximize the ratio between the variances of two classes. The conventional CSP is however sensitive to outliers because it is based on the L2-norm. Inspired by the correntropy induced metric (CIM), we propose in this work a new algorithm, called CIM based CSP (CSP-CIM), to improve the robustness of CSP with respect to outliers. The CSP-CIM searches the optimal solution by a simple gradient based iterative algorithm. A toy example and a real EEG dataset are used to demonstrate the desirable performance of the new method.
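The robustness claim above can be made concrete by comparing the L2 distance with the correntropy induced metric on data containing one outlier. The sketch below uses the usual sample definition of CIM with an unnormalized Gaussian kernel (so the kernel value at zero is 1); the kernel width, the toy vectors, and the function names are assumptions of this example, and the way the paper plugs CIM into the CSP objective is not reproduced here.

```python
import math


def gaussian_kernel(e, sigma):
    """Unnormalized Gaussian kernel; equals 1.0 at e = 0."""
    return math.exp(-(e * e) / (2.0 * sigma * sigma))


def cim(x, y, sigma=1.0):
    """Correntropy induced metric: sqrt(kernel(0) - mean kernel(x_i - y_i))."""
    correntropy = sum(gaussian_kernel(a - b, sigma) for a, b in zip(x, y)) / len(x)
    return math.sqrt(max(0.0, gaussian_kernel(0.0, sigma) - correntropy))


def l2(x, y):
    """Ordinary Euclidean distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Toy data (hypothetical): ten small deviations, then one replaced by an outlier.
zeros = [0.0] * 10
clean = [0.1] * 10
corrupted = [0.1] * 9 + [1000.0]

l2_clean, l2_corrupted = l2(clean, zeros), l2(corrupted, zeros)
cim_clean, cim_corrupted = cim(clean, zeros), cim(corrupted, zeros)
```

A single sample can raise the squared CIM by at most 1/N no matter how large its error is, while the L2 distance grows without bound; this saturation is the property that motivates replacing the L2-norm in CSP.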
DOI: https://doi.org/10.1109/MLSP.2017.8168132 | Published: 2017-09-01 | Pages: 1-6
Citations: 8
Texture classification from single uncalibrated images: Random matrix theory approach
E. Nadimi, J. Herp, M. M. Buijs, V. Blanes-Vidal
We studied the problem of classifying textured materials from their single-imaged appearance, under general viewing and illumination conditions, using the theory of random matrices. To evaluate the performance of our algorithm, two distinct databases of images were used: the CUReT database and our database of colorectal polyp images collected from patients undergoing colon capsule endoscopy for early cancer detection. During the learning stage, our classifier algorithm established the universality laws for the empirical spectral density of the largest singular value and normalized largest singular value of the image intensity matrix adapted to the eigenvalues of the information-plus-noise model. We showed that these two densities converge to the generalized extreme value (GEV-Frechet) and Gaussian G1 distribution with rate O(N^{1/2}), respectively. To validate the algorithm, we introduced a set of unseen images to the algorithm. A misclassification rate of approximately 1%–6%, depending on the database, was obtained, which is superior to the reported values of 5%–45% in previous research studies.
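The basic quantity used above, the largest singular value of an image intensity matrix, can be computed without a full decomposition. The sketch below uses power iteration on the matrix product A^T A; it is an illustration of that feature only. The normalization and the classifier described in the abstract are not specified there and are not reproduced, and the function name, iteration count, and toy matrices are assumptions.

```python
import math


def largest_singular_value(a, iterations=200):
    """Estimate the largest singular value of a nested-list matrix.

    Power iteration on A^T A; assumes the all-ones start vector is not
    orthogonal to the leading right singular vector.
    """
    rows, cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        atav = [sum(a[i][j] * av[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(c * c for c in atav))
        if norm == 0.0:
            return 0.0  # zero matrix (or start vector in the null space)
        v = [c / norm for c in atav]
    av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(c * c for c in av))


# Toy "images" (hypothetical): a diagonal matrix and a rank-1 matrix.
sigma_diag = largest_singular_value([[3.0, 0.0], [0.0, 4.0]])
sigma_rank1 = largest_singular_value([[1.0, 2.0], [2.0, 4.0]])
```

Collecting this value over many images of one material gives the empirical distribution that the abstract compares against its limiting laws.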
DOI: https://doi.org/10.1109/MLSP.2017.8168115 (published 2017-09-01)
Citations: 2
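The texture-classification abstract above builds its features from the largest singular value and the normalized largest singular value of an image intensity matrix. The following is a minimal sketch of how those two quantities might be computed, not the authors' implementation. The function names are hypothetical, the power-iteration method is a choice made here for self-containment, and the normalization by the Frobenius norm is an assumption, since the abstract does not state which normalization is used.

```python
import math


def largest_singular_value(matrix, iterations=200):
    """Estimate the largest singular value by power iteration on A^T A.

    `matrix` is a list of equal-length rows of floats (for example, pixel
    intensities). The all-ones start vector is suitable for non-negative
    matrices, whose leading singular vector has non-negative entries.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # deterministic unit start vector
    sigma = 0.0
    for _ in range(iterations):
        # u = A v; its Euclidean norm is the current singular value estimate
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        sigma = math.sqrt(sum(x * x for x in u))
        if sigma == 0.0:
            return 0.0
        # w = A^T u, then renormalize to get the next iterate
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]
    return sigma


def normalized_largest_singular_value(matrix):
    """Largest singular value divided by the Frobenius norm (assumed normalization)."""
    frobenius = math.sqrt(sum(x * x for row in matrix for x in row))
    if frobenius == 0.0:
        return 0.0
    return largest_singular_value(matrix) / frobenius


if __name__ == "__main__":
    image = [[3.0, 0.0], [0.0, 4.0]]  # toy 2x2 "intensity matrix"
    print(largest_singular_value(image))
    print(normalized_largest_singular_value(image))
```

Under this assumed normalization the value lies in (0, 1] for any nonzero matrix and equals 1 exactly when the matrix has rank one. Fitting the resulting per-image values to a GEV or Gaussian density, as the abstract describes, would be a separate step that is not shown here.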