ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Articles

Neural Oracle Search on N-Best Hypotheses

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054745

Ehsan Variani, Tongzhou Chen, J. Apfel, B. Ramabhadran, Seungjin Lee, P. Moreno

In this paper, we propose a neural search algorithm to select the most likely hypothesis using a sequence of acoustic representations and multiple hypotheses as input. The algorithm provides a sequence level score for each audio-hypothesis pair that is obtained by integrating information from multiple sources, such as the input acoustic representations, N-best hypotheses, additional 1st-pass statistics, and unpaired textual information through an external language model. These scores are then used to map the search problem of identifying the most likely hypothesis to a sequence classification problem. The definition of the proposed algorithm is broad enough to allow its use as an alternative to beam search in the 1st-pass or as a 2nd-pass, rescoring step. This algorithm achieves up to 12% relative reductions in Word Error Rate (WER) across several languages over state-of-the-art baselines with relatively few additional parameters. We also propose the use of a binary classifier gating function that can learn to trigger the 2nd-pass neural search model when the 1-best hypothesis is not the oracle hypothesis, thereby avoiding extra computation.
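
The gated second-pass selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' model: the function names, the `gate_threshold` value, and the example scores are all assumptions, and the neural networks that would produce the sequence-level scores and the gate probability are not shown.

```python
import math


def softmax(scores):
    """Turn sequence-level scores into a distribution over the N-best list."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_hypothesis(nbest, second_pass_scores, gate_probability, gate_threshold=0.5):
    """Choose one hypothesis from an N-best list (hypothetical helper).

    nbest[0] is assumed to be the 1-best hypothesis from the 1st pass.
    gate_probability is assumed to be a binary classifier's estimate that
    the 1-best hypothesis is NOT the oracle hypothesis.
    """
    # Gate closed: keep the 1-best and skip the 2nd-pass computation.
    if gate_probability < gate_threshold:
        return nbest[0]
    # Gate open: treat selection as classification over the N hypotheses.
    posteriors = softmax(second_pass_scores)
    best_index = max(range(len(nbest)), key=lambda i: posteriors[i])
    return nbest[best_index]


hypotheses = ["call mom", "call tom", "cold mom"]
scores = [0.1, 2.0, -1.0]  # made-up 2nd-pass scores for illustration
rescored = pick_hypothesis(hypotheses, scores, gate_probability=0.9)
kept = pick_hypothesis(hypotheses, scores, gate_probability=0.1)
```

With the gate open the highest-scoring hypothesis is returned; with the gate closed the 1-best is kept and the scores are never consulted.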

Citations: 11

One-Bit Compressed Sensing Using Generative Models

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054212

Geethu Joseph, Swatantra Kafle, P. Varshney

In this paper, we address the classical problem of one-bit compressed sensing. We present a deep learning based reconstruction algorithm that relies on a generative model. The generator, which is a neural network, learns a mapping from a low dimensional space to a higher dimensional set comprising sparse vectors. This pre-trained generator is used to reconstruct sparse vectors from their one-bit measurements by searching over the range of the generator. Hence, the algorithm presented in this paper provides excellent reconstruction accuracy by accounting for any other possible structure in the signal apart from sparsity. Further, we provide theoretical guarantees on the reconstruction accuracy of the presented algorithm. Using numerical results, we also demonstrate the efficacy of our algorithm compared to other existing algorithms.
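
For readers unfamiliar with the setting, the sketch below shows the standard one-bit measurement model, in which only the sign of each linear measurement is kept. The abstract does not spell this model out, and the dimensions, sparsity level, and Gaussian sensing matrix here are assumptions chosen for illustration; the generative-model reconstruction itself is not shown.

```python
import random


def one_bit_measure(sensing_matrix, signal):
    """Keep only the sign (+1 or -1) of each linear measurement."""
    return [
        1 if sum(a * v for a, v in zip(row, signal)) >= 0 else -1
        for row in sensing_matrix
    ]


rng = random.Random(0)
n, m, k = 20, 8, 3  # signal length, measurement count, nonzeros (assumed values)

# Build a k-sparse signal with Gaussian nonzero entries.
x = [0.0] * n
for idx in rng.sample(range(n), k):
    x[idx] = rng.gauss(0.0, 1.0)

# Random Gaussian sensing matrix, a common choice in this literature.
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

y = one_bit_measure(A, x)
# Scaling the signal by a positive constant leaves every sign unchanged,
# so one-bit measurements carry no information about the signal's norm.
y_scaled = one_bit_measure(A, [2.0 * v for v in x])
```

Because amplitude is lost, reconstruction from such measurements can only recover the signal up to a positive scale factor.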

Citations: 3

Sequential Methods for Detecting a Change in the Distribution of an Episodic Process

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054529

T. Banerjee, Edmond Adib, A. Taha, E. John

A new class of stochastic processes called episodic processes is introduced to model the statistical regularity of data observed in several applications in cyberphysical systems, neuroscience, and medicine. Algorithms are proposed to detect a change in the distribution of episodic processes. The algorithms can be computed recursively using finite memory and are shown to be asymptotically optimal for well-defined Bayesian or minimax stochastic optimization formulations. The application of the developed algorithms to detect a change in waveform patterns is also discussed.
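
To illustrate what a recursive, finite-memory change detector looks like, here is the classical CUSUM test for a mean shift in i.i.d. unit-variance Gaussian data. It is a stand-in, not the episodic-process algorithm of the paper; the pre-change mean of 0, the `post_mean` value, and the `threshold` are assumptions.

```python
def cusum_alarm_time(samples, post_mean=1.0, threshold=5.0):
    """Return the 1-based index of the first alarm, or None if none is raised.

    Assumes pre-change samples are N(0, 1) and post-change samples are
    N(post_mean, 1). Only one number (the statistic) is kept in memory.
    """
    statistic = 0.0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio of N(post_mean, 1) against N(0, 1) at x.
        llr = post_mean * x - 0.5 * post_mean ** 2
        # Recursive update: accumulate evidence, but never drop below zero.
        statistic = max(0.0, statistic + llr)
        if statistic >= threshold:
            return n
    return None


# Deterministic toy data: the level jumps from 0 to 2 after the 10th sample.
data = [0.0] * 10 + [2.0] * 10
alarm = cusum_alarm_time(data)
no_alarm = cusum_alarm_time([0.0] * 20)
```

On the toy data each post-change sample adds 1.5 to the statistic, so the threshold of 5 is crossed on the fourth post-change sample.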

Citations: 0

ACU-Net: A 3D Attention Context U-Net for Multiple Sclerosis Lesion Segmentation

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054616

Chuan Hu, Guixia Kang, Beibei Hou, Yiyuan Ma, F. Labeau, Zichen Su

Multiple Sclerosis (MS) lesion segmentation from MR images is important for neuroimaging analysis. MS is diffuse and multifocal, and tends to involve peripheral brain structures such as the white matter, corpus callosum, and brainstem. Recently, U-Net has achieved strong results in medical image segmentation. However, its insufficient use of context information and feature representation prevents it from segmenting MS lesions accurately. To solve this problem, a 3D attention context U-Net (ACU-Net) is proposed for MS lesion segmentation in this paper. The proposed ACU-Net includes a 3D spatial attention block, which is used to enrich the spatial details and feature representation of lesions in the decoding stage. Furthermore, in the encoding and decoding stages of the network, a 3D context guided module is designed for guiding local information and surrounding information. The proposed ACU-Net was evaluated on the ISBI 2015 longitudinal MS lesion segmentation challenge dataset, and it achieved superior performance compared to the latest approaches.

Citations: 11

EEG Connectivity-Informed Cooperative Adaptive Line Enhancer for Recognition of Brain State

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9052923

S. Sanei, C. C. Took, D. Jarchi, A. Procházka

Bursts of sleep spindles and paroxysmal fast brain activity waveforms overlap in frequency, although paroxysmal waveforms generally have a shorter duration than spindles. Both resemble bursts of normal alpha activity during short rests while awake with closed eyes. In this paper, it is shown that for a proposed cooperative adaptive line enhancer, which can both detect and separate such periodic bursts, the combination weights are consistently different from each other. The outcome suggests that for accurate modelling of the brain neuro-generators, the brain connectivity has to be precisely estimated and plugged into the adaptation process.

Citations: 0

One-Shot Voice Conversion by Vector Quantization

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053854

Da-Yi Wu, Hung-yi Lee

In this paper, we propose a vector quantization (VQ) based one-shot voice conversion (VC) approach without any supervision from speaker labels. We model the content embedding as a series of discrete codes and take the difference between the vectors before and after quantization as the speaker embedding. We show that this approach has a strong ability to disentangle the content and speaker information with reconstruction loss only, and one-shot VC is thus achieved.
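
The split into discrete content codes and a quantization residual can be sketched as below. The tiny two-entry codebook, the frame values, and the helper names are hypothetical, and averaging the residual over time to obtain a single speaker vector is an assumption of this sketch rather than something the abstract states; the encoder and decoder networks are not shown.

```python
def nearest_code(frame, codebook):
    """Index of the codebook vector closest to the frame (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])),
    )


def vq_split(frames, codebook):
    """Split encoder outputs into content codes and a speaker vector."""
    codes = [nearest_code(frame, codebook) for frame in frames]
    quantized = [codebook[k] for k in codes]
    dim = len(frames[0])
    # Residual between the vectors before and after quantization,
    # averaged over time (assumption) to give one utterance-level vector.
    speaker = [
        sum(f[d] - q[d] for f, q in zip(frames, quantized)) / len(frames)
        for d in range(dim)
    ]
    return codes, speaker


codebook = [[0.0, 0.0], [1.0, 1.0]]      # hypothetical learned codes
frames = [[0.25, 0.25], [1.25, 1.25]]    # hypothetical encoder outputs
codes, speaker = vq_split(frames, codebook)
```

In this toy case both frames sit at the same offset from their nearest code, so the offset is attributed to the speaker while the code indices carry the content.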

Citations: 64

Blind Multi-Spectral Image Pan-Sharpening

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053554

Lantao Yu, Dehong Liu, H. Mansour, P. Boufounos, Yanting Ma

We address the problem of sharpening low spatial-resolution multi-spectral (MS) images with their associated misaligned high spatial-resolution panchromatic (PAN) image, based on priors on the spatial blur kernel and on the cross-channel relationship. In particular, we formulate the blind pan-sharpening problem within a multi-convex optimization framework using total generalized variation for the blur kernel and a local Laplacian prior for the cross-channel relationship. The problem is solved by the alternating direction method of multipliers (ADMM), which alternately updates the blur kernel and sharpens intermediate MS images. Numerical experiments demonstrate that our approach is more robust to large misalignment errors and yields better super-resolved MS images compared to state-of-the-art optimization-based and deep-learning-based algorithms.

Citations: 8

On-the-Fly Feature Selection and Classification with Application to Civic Engagement Platforms

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053564

Yasitha Warahena Liyanage, Daphney-Stavroula Zois, C. Chelmis

Online feature selection and classification is crucial for time-sensitive decision making. Existing work, however, either assumes that features are independent or produces a fixed number of features for classification. Instead, we propose an optimal framework to perform joint feature selection and classification on-the-fly while relaxing the assumption on feature independence. The effectiveness of the proposed approach is shown by classifying urban issue reports on the SeeClickFix civic engagement platform. A significant reduction in the average number of features used is observed without a drop in the classification accuracy.

Citations: 5

Efficient Constrained Encoders Correcting a Single Nucleotide Edit in DNA Storage

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053256

K. Cai, Xuan He, H. M. Kiah, T. T. Nguyen

A nucleotide substitution is said to occur when a base in {A, T} is substituted for a base in {C, G}, or vice versa. A recent experiment (Heckel et al. 2019) showed that a nucleotide substitution occurs with a significantly higher probability than other substitution errors. A nucleotide edit refers to a single insertion, deletion or nucleotide substitution. In this paper, we investigate codes that correct a single nucleotide edit and provide linear-time algorithms that encode binary messages into these codes of length n. Specifically, we provide an order-optimal encoder which corrects a single nucleotide edit with log n + log log n + O(1) redundant bits. We also demonstrate that the codewords obey certain runlength constraints and that the code can be modified to accommodate certain GC-content constraints.
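
The definition of a nucleotide substitution and the size of the leading redundancy terms can be made concrete with a short sketch. The helper names are hypothetical, the use of base-2 logarithms is an assumption (the abstract does not state the base), and the O(1) term is ignored; the encoder itself is not shown.

```python
import math

AT_BASES = {"A", "T"}
CG_BASES = {"C", "G"}


def is_nucleotide_substitution(original, received):
    """True when a base in {A, T} becomes a base in {C, G}, or vice versa."""
    return (original in AT_BASES and received in CG_BASES) or (
        original in CG_BASES and received in AT_BASES
    )


def leading_redundancy_bits(n):
    """log n + log log n with base-2 logs (assumed), dropping the O(1) term."""
    return math.log2(n) + math.log2(math.log2(n))


a_to_c = is_nucleotide_substitution("A", "C")   # crosses the two sets
a_to_t = is_nucleotide_substitution("A", "T")   # stays inside {A, T}
bits_for_256 = leading_redundancy_bits(256)     # 8 + 3
```

Under these assumptions, a code of length 256 would spend about 11 bits plus a constant on redundancy.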

Citations: 2

Incorporating Written Domain Numeric Grammars into End-To-End Contextual Speech Recognition Systems for Improved Recognition of Numeric Sequences

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054259

Ben Haynor, Petar S. Aleksic

Accurate recognition of numeric sequences is crucial for many contextual speech recognition applications. For example, a user might create a calendar event and be prompted by a virtual assistant for the time, date, and duration of the event. We propose a modular and scalable solution for improved recognition of numeric sequences. We use finite state transducers built from written domain numeric grammars to increase the likelihood of hypotheses containing matching numeric entities during beam search in an end-to-end speech recognition system. Our technique yields a relative reduction in word error rate of up to 59% on a variety of numeric sequence recognition tasks (times, percentages, digit sequences, …).

Citations: 3
