2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications


Using Block Coordinate Descent to Learn Sparse Coding Dictionaries with a Matrix Norm Update

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8461499

Bradley M. Whitaker, David V. Anderson

Researchers have recently examined a modified approach to sparse coding that encourages dictionaries to learn anomalous features. This is done by incorporating the matrix 1-norm, or $\ell_{1,\infty}$ mixed matrix norm, into the dictionary update portion of a sparse coding algorithm. However, solving a matrix norm minimization problem in each iteration of the algorithm causes it to run more slowly. The purpose of this paper is to introduce block coordinate descent, a subgradient-like approach to minimizing the matrix norm, to the dictionary update. This approach removes the need to solve a convex optimization program in each iteration and dramatically reduces the time required to learn a dictionary. Importantly, the dictionary learned in this manner can still model anomalous features present in a dataset.
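To make the norm in this abstract concrete, here is a minimal sketch of how it can be computed. It assumes the common convention in which the $\ell_{1,\infty}$ mixed norm takes the $\ell_1$ norm of each column and then the maximum across columns, which is what makes it coincide with the induced matrix 1-norm. The function name `l1_inf_norm` is hypothetical and is not taken from the paper.

```python
def l1_inf_norm(matrix):
    """Return the largest column-wise l1 norm of a matrix given as a list of rows.

    Assumed convention: l1 within each column, then l-infinity across columns.
    """
    columns = zip(*matrix)  # transpose: iterate over columns instead of rows
    return max(sum(abs(entry) for entry in column) for column in columns)


# Example dictionary with two atoms (columns); values are illustrative only.
dictionary = [[1.0, -2.0],
              [3.0, 4.0]]
norm_value = l1_inf_norm(dictionary)  # column sums are 4.0 and 6.0
```

Because the value depends only on the heaviest column, a penalty on this norm acts on one atom at a time, which gives some intuition for why coordinate-wise updates are a natural fit.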


Citations: 1

Mutual-Information-Private Online Gradient Descent Algorithm

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8461756

Ruochi Zhang, P. Venkitasubramaniam

A user-implemented privacy-preservation mechanism is proposed for the online gradient descent (OGD) algorithm. Privacy is measured through the information leakage, quantified by the mutual information between the users' outputs and the learner's inputs. The proposed input perturbation mechanism can be implemented by individual users with space and time complexity that is independent of the horizon T. For the proposed mechanism, the information leakage is shown to be bounded by the Gaussian channel capacity in the full-information setting. The regret bound of the privacy-preserving learning mechanism matches that of non-private OGD, differing only in constant factors.
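The general idea of user-side perturbation in OGD can be sketched as follows. This is only an illustration under stated assumptions: the perturbation is taken to be zero-mean Gaussian noise added to each gradient before it reaches the learner, the loss is a one-dimensional squared error, and the step size decays as $1/\sqrt{t}$. The paper's actual mechanism and noise calibration may differ, and the name `private_ogd` is hypothetical.

```python
import random


def private_ogd(targets, sigma, step=0.5, radius=1.0, seed=0):
    """Projected online gradient descent on losses (x - y_t)^2 with noisy gradients.

    sigma is the standard deviation of the user-side Gaussian noise
    (an assumed perturbation, not the paper's calibrated mechanism).
    Returns the iterate played in each round.
    """
    rng = random.Random(seed)
    x = 0.0
    iterates = []
    for t, y in enumerate(targets, start=1):
        iterates.append(x)
        grad = 2.0 * (x - y)                        # gradient of (x - y)^2
        noisy_grad = grad + rng.gauss(0.0, sigma)   # user perturbs before sending
        x = x - (step / t ** 0.5) * noisy_grad      # learner's OGD step
        x = max(-radius, min(radius, x))            # project onto [-radius, radius]
    return iterates


noisy_run = private_ogd([0.5] * 200, sigma=0.1)
clean_run = private_ogd([0.5] * 200, sigma=0.0)
```

Each user draws one noise sample per round and keeps no history, so the per-round cost in this sketch does not grow with the horizon.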


Citations: 1

Dynamic Multi-Rater Gaussian Mixture Regression Incorporating Temporal Dependencies of Emotion Uncertainty Using Kalman Filters

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8461321

T. Dang, V. Sethu, E. Ambikairajah

Prediction of continuous emotion in terms of affective attributes has mainly focused on hard labels, which ignore the ambiguity of recognizing certain emotions. This ambiguity may result in high inter-rater variability, which in turn causes prediction uncertainty that varies with time. Based on the assumption that temporal dependencies occur in the evolution of emotion uncertainty, this paper proposes a dynamic multi-rater Gaussian Mixture Regression (GMR) that aims to predict the emotion uncertainty reflected by multiple raters while taking their temporal dependencies into account. The framework incorporates feedforward and backward Kalman filters into GMR to estimate the time-dependent label distribution that reflects the emotion uncertainty. It also has the benefit of relaxing the Gaussian assumption on the label distribution to that of a Gaussian Mixture Model (GMM). In addition, a new measure that estimates emotion uncertainty from the GMM as the local variability is adopted. Experiments conducted on the RECOLA database reveal that incorporating temporal dependencies is critical for emotion uncertainty prediction, with a 17% relative improvement for arousal, and that the proposed framework for emotion uncertainty prediction shows potential in conventional emotion attribute prediction.


Pages: 4929-4933

Citations: 12

Generalised Discriminative Transform via Curriculum Learning for Speaker Recognition

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8461296

E. Marchi, Stephen Shum, Kyuyeon Hwang, S. Kajarekar, Siddharth Sigtia, H. Richards, R. Haynes, Yoon Kim, J. Bridle

In this paper we introduce a speaker verification system deployed on mobile devices that can be used to personalise a keyword spotter. We describe a baseline DNN system that maps an utterance to a speaker embedding, which is used to measure speaker differences via cosine similarity. We then introduce an architectural modification which uses an LSTM system where the parameters are optimised via a curriculum learning procedure to reduce the detection error and improve its generalisability across various conditions. Experiments on our internal datasets show that the proposed approach outperforms the DNN baseline system and yields a relative EER reduction of 30-70% on both text-dependent and text-independent tasks under a variety of acoustic conditions.
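The cosine-similarity scoring step mentioned in this abstract can be sketched as below. The embeddings here are made-up toy vectors, and the decision threshold of 0.7 and the function names are assumptions for illustration. In the system described, the embeddings would come from the DNN or LSTM model, which is not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_speaker(enrolled, test, threshold=0.7):
    """Accept the test utterance if its embedding is close to the enrolled one.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data to trade off false accepts against false rejects.
    """
    return cosine_similarity(enrolled, test) >= threshold


enrolled_embedding = [1.0, 2.0]   # toy enrollment embedding
test_embedding = [2.0, 4.1]       # toy test embedding, nearly parallel
accepted = same_speaker(enrolled_embedding, test_embedding)
```

Sweeping the threshold and finding the point where the false-accept and false-reject rates are equal gives the equal error rate (EER) that the abstract reports.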


Citations: 19

Low Rank Fourier Ptychography

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8462480

Zhengyu Chen, Gauri Jagatap, Seyedehsara Nayer, C. Hegde, Namrata Vaswani

In this paper, we introduce a principled algorithmic approach for Fourier ptychographic imaging of dynamic, time-varying targets. To the best of our knowledge, this setting has not been explicitly addressed in the ptychography literature. We argue that such a setting is very natural, and that our methods provide an important first step towards helping reduce the sample complexity (and hence acquisition time) of imaging dynamic scenes to manageable levels. With significantly reduced acquisition times per image, it is conceivable that dynamic ptychographic imaging of fast-changing scenes will indeed become practical in the near future.


Citations: 14

MMSE Adaptive Waveform Design for a MIMO Active Sensing System Tracking Multiple Moving Targets

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8462319

Steven Herbert, J. Hopgood, B. Mulgrew

This paper proposes a method for minimum mean squared error (MMSE) adaptive waveform design (AWD) in multiple-input-multiple-output (MIMO) active sensing systems that are used to track moving targets. The proposed method offers two computational improvements over a related method for static targets. Considering moving targets also introduces the possibility of ‘model mismatch’ between the actual motion of the targets and the model available to the MMSE AWD system. Results show that the proposed method improves mean squared error performance by up to 29% compared to the non-adaptive case.


Citations: 2

Investigating Label Noise Sensitivity of Convolutional Neural Networks for Fine Grained Audio Signal Labelling

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8461291

Rainer Kelz, G. Widmer

We measure the effect of small amounts of systematic and random label noise caused by slightly misaligned ground truth labels in a fine-grained audio signal labeling task. The task we choose to demonstrate these effects on is also known as framewise polyphonic transcription or note-quantized multi-f0 estimation, and transforms a monaural audio signal into a sequence of note indicator labels. It will be shown that even slight misalignments have clearly apparent effects, demonstrating a great sensitivity of convolutional neural networks to label noise. The implications are clear: when using convolutional neural networks for fine-grained audio signal labeling tasks, great care has to be taken to ensure that the annotations have precise timing and are as free from systematic or random error as possible, because even small misalignments will have a noticeable impact.


Citations: 3

On the Geometry of Mixtures of Prescribed Distributions

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8461869

F. Nielsen, R. Nock

We consider the space of w-mixtures, that is, finite statistical mixtures sharing the same prescribed component distributions, such as Gaussian mixture models sharing the same components. The information geometry induced by the Kullback-Leibler (KL) divergence yields a dually flat space where the KL divergence between two w-mixtures amounts to a Bregman divergence for the negative Shannon entropy generator, called the Shannon information. Furthermore, we prove that the skew Jensen-Shannon statistical divergence between w-mixtures amounts to skew Jensen divergences on their parameters, and we state several divergence inequalities between w-mixtures and their closures.
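The identity between the KL divergence and a Bregman divergence stated in this abstract can be checked in a few lines. The notation below (components $p_0, \dots, p_{k-1}$, weight vector $\theta$, generator $F$) is an assumed parameterization chosen for illustration and may not match the paper's symbols.

```latex
% Assumed parameterization: k fixed components, k-1 free weights.
m_\theta(x) = \sum_{i=1}^{k-1} \theta_i\, p_i(x)
            + \Bigl(1 - \sum_{i=1}^{k-1} \theta_i\Bigr) p_0(x)

% Generator: negative Shannon entropy of the mixture.
F(\theta) = \int m_\theta(x) \log m_\theta(x)\, dx

% Its gradient; the constant term vanishes because \int (p_i - p_0)\, dx = 0.
\frac{\partial F}{\partial \theta_i}(\theta)
  = \int \bigl(p_i(x) - p_0(x)\bigr) \log m_\theta(x)\, dx

% Hence the inner product collapses to a difference of mixtures:
\langle \theta_1 - \theta_2, \nabla F(\theta_2) \rangle
  = \int \bigl(m_{\theta_1}(x) - m_{\theta_2}(x)\bigr) \log m_{\theta_2}(x)\, dx

% Bregman divergence of F equals the KL divergence between the mixtures:
B_F(\theta_1 : \theta_2)
  = F(\theta_1) - F(\theta_2) - \langle \theta_1 - \theta_2, \nabla F(\theta_2) \rangle
  = \int m_{\theta_1}(x) \log \frac{m_{\theta_1}(x)}{m_{\theta_2}(x)}\, dx
  = \mathrm{KL}(m_{\theta_1} : m_{\theta_2})
```

The key step is that $m_\theta$ is affine in $\theta$, so the linear term of the Bregman divergence reproduces the cross-entropy term exactly.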


Citations: 18

Joint Separation and Dereverberation of Reverberant Mixtures with Determined Multichannel Non-Negative Matrix Factorization

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8462080

Hideaki Kagami, H. Kameoka, M. Yukawa

This paper proposes an extension of multichannel non-negative matrix factorization (MNMF) that simultaneously solves source separation and dereverberation. While MNMF was originally formulated under an underdetermined problem setting where sources can outnumber microphones, a determined counterpart of MNMF, which we call the determined MNMF (DMNMF), has recently been proposed with notable success. This approach is particularly notable in that the optimization process can be more than 30 times faster than the underdetermined version because it involves no matrix inversion computations. One drawback of all methods based on instantaneous mixture models, including MNMF, is that they cope poorly with long reverberation. To overcome this drawback, this paper proposes an extension of DMNMF using a frequency-domain convolutive mixture model. The optimization process of the proposed method consists of iteratively updating (i) the spectral parameters of each source using the majorization-minimization algorithm, (ii) the separation matrix using iterative projection, and (iii) the dereverberation filters using multichannel linear prediction. Experimental results showed that the proposed method yielded higher separation performance and dereverberation performance than the baseline method under highly reverberant environments.


Pages: 31-35

Citations: 49

Using Deep Learning to Classify Power Consumption Signals of Wireless Devices: An Application to Cybersecurity

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2018-04-23 DOI: 10.1109/ICASSP.2018.8461304

Abdurhman Albasir, R. S. R. James, S. Naik, A. Nayak

The problem of detecting malware on mobile devices is becoming increasingly important. Because most mobile devices run on very limited resources, installing on-board antivirus software is not very practical, especially on IoT devices. Even if such tools exist, malware can hide or manipulate its fingerprint, making it hard to detect. Effective countermeasures after a malware intrusion are therefore paramount. In this work, we use the ability of deep learning to learn multiple levels of representation from raw data to classify power consumption signals obtained from smartphones. The objective is to build a framework that can intelligently tell whether a smartphone has malware solely by monitoring its power consumption. Validation tests of the proposed framework show that the information contained in the measured power consumption of smartphones can in principle be used to identify the presence of malware, and can further indicate how active the malware is with very high accuracy.


Citations: 7
