
Latest Publications: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Compressed Sensing Based Channel Estimation and Open-loop Training Design for Hybrid Analog-digital Massive MIMO Systems
Khaled Ardah, Bruno Sokal, A. D. Almeida, M. Haardt
Channel estimation in hybrid analog-digital massive MIMO systems is a challenging problem due to the high channel dimension, the low signal-to-noise ratio before beamforming, and the reduced number of radio-frequency chains. Compressed sensing based algorithms have been adopted to address these challenges by leveraging the sparse nature of millimeter-wave MIMO channels. In compressed sensing-based methods, the training vectors should be designed carefully to guarantee recoverability. Although using random vectors provides a strong recoverability guarantee, it has recently been shown that an optimized update, obtained by minimizing the mutual coherence of the resulting sensing matrix, can improve this guarantee. In this paper, we propose an open-loop hybrid analog-digital beam-training framework in which a given sensing matrix is decomposed into analog and digital beamformers. The sensing matrix can be designed efficiently offline to reduce computational complexity. Simulation results show that the proposed training method achieves lower mutual coherence and better channel estimation performance than the benchmark methods.
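The design criterion above is the mutual coherence of the sensing matrix. As a minimal illustration (not the paper's optimization itself), the quantity is the largest absolute inner product between distinct, normalized columns:

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct, normalized columns of A."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = np.abs(A.T @ A)          # Gram matrix of normalized columns
    np.fill_diagonal(G, 0.0)     # ignore self-correlations
    return G.max()

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 64))   # 16 measurements, 64-atom dictionary (toy sizes)
print(mutual_coherence(A))          # random Gaussian matrices are generally not optimal
```

A sensing-matrix design procedure would search for matrices driving this value down, since lower coherence yields stronger sparse-recovery guarantees.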
DOI: 10.1109/ICASSP40776.2020.9054443 | Pages: 4597-4601 | Published: 2020-05-01
Citations: 6
Joint Training of Deep Neural Networks for Multi-Channel Dereverberation and Speech Source Separation
M. Togami
In this paper, we propose joint training of two deep neural networks (DNNs) for dereverberation and speech source separation. The proposed method connects the first DNN, the dereverberation part, the second DNN, and the speech source separation part in a cascade. Rather than training each DNN separately, it adopts an integrated loss function that evaluates the output signal after both dereverberation and speech source separation. The proposed method estimates the output signal as a probabilistic variable. Recently, in the speech source separation context, we proposed a loss function that evaluates the estimated posterior probability density function (PDF) of the output signal. In this paper, we extend this loss function so that it evaluates not only speech source separation performance but also speech dereverberation performance. Since the output signal of the dereverberation part is converted into the input feature of the second DNN, the gradient of the loss function is back-propagated into the first DNN through this input feature. Experimental results show that the proposed joint training of the two DNNs is effective. It is also shown that the posterior-PDF-based loss function is effective in the joint training context.
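The key point is that the integrated loss back-propagates through the cascade into the first network. A toy sketch with linear stages (the paper uses DNNs; all names here are illustrative) shows the chain rule at work, verified with a finite-difference check:

```python
import numpy as np

# Toy cascade: stage 1 (dereverberation) -> stage 2 (separation), trained jointly.
# Both stages are linear here purely for illustration.
rng = np.random.default_rng(1)
x = rng.standard_normal(8)             # input mixture
target = rng.standard_normal(8)        # clean target
W1 = rng.standard_normal((8, 8)) * 0.1
W2 = rng.standard_normal((8, 8)) * 0.1

def loss(W1, W2):
    """Integrated loss: evaluated only on the final output of the cascade."""
    return 0.5 * np.sum((W2 @ (W1 @ x) - target) ** 2)

# Gradient w.r.t. the FIRST stage flows through the second stage (chain rule);
# this is what distinguishes joint training from training each stage alone.
err = W2 @ (W1 @ x) - target           # dL/d(output)
gW1 = np.outer(W2.T @ err, x)          # back-propagated through W2

# Finite-difference check of one entry
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
fd = (loss(W1p, W2) - loss(W1, W2)) / eps
print(abs(fd - gW1[0, 0]) < 1e-4)      # analytic and numeric gradients agree
```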
DOI: 10.1109/ICASSP40776.2020.9053791 | Pages: 3032-3036 | Published: 2020-05-01
Citations: 5
Deblurring And Super-Resolution Using Deep Gated Fusion Attention Networks For Face Images
Chao Yang, Long-Wen Chang
Image deblurring and super-resolution are very important in image processing tasks such as face verification. However, images captured outdoors are often blurry and of low resolution. To solve this problem, we propose a deep gated fusion attention network (DGFAN) to generate a high-resolution image without blurring artifacts. We extract features from two task-independent structures for deblurring and super-resolution to avoid the error propagation that occurs in a cascade of deblurring and super-resolution stages. We also add an attention module that uses channel-wise and spatial-wise features to obtain better features, and propose an edge loss function to make the model focus on facial features such as eyes and nose. DGFAN performs favorably against state-of-the-art methods in terms of PSNR and SSIM. Moreover, using the clear images generated by DGFAN improves the accuracy of face verification.
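The paper reports PSNR; for reference, a minimal PSNR routine (assuming images scaled to [0, 1]) looks like this:

```python
import numpy as np

def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with pixel range [0, peak]."""
    mse = np.mean((ref - est) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4))
est = ref + 0.01                 # uniform error of 0.01 per pixel
print(round(psnr(ref, est), 1))  # -> 40.0 dB
```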
DOI: 10.1109/ICASSP40776.2020.9053784 | Pages: 1623-1627 | Published: 2020-05-01
Citations: 5
Adversarial Multi-Task Learning for Speaker Normalization in Replay Detection
Gajan Suthokumar, V. Sethu, Kaavya Sriskandaraja, E. Ambikairajah
Spoofing detection algorithms in voice biometrics are adversely affected by differences in the speech characteristics of the various target users. In this paper, we propose a novel speaker normalisation technique that employs adversarial multi-task learning to compensate for this speaker variability. The proposed system is designed to learn a feature space that discriminates between genuine and replayed speech while simultaneously reducing the discrimination between different speakers. We initially characterise the impact of speaker variability and quantify the effect of the proposed speaker normalisation technique directly on the feature distributions. Following this, we validate the technique in spoofing detection experiments carried out on two different corpora, ASVSpoof 2017 v2.0 and BTAS 2016 replay, and demonstrate its effectiveness. We obtain EERs of 7.11% and 0.83% on the two corpora respectively, lower than those of all relevant baselines.
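The EER quoted above is the operating point where the false-accept and false-reject rates are equal. A simple sketch of computing it from detection scores (toy scores, not the paper's data):

```python
import numpy as np

def eer(genuine, spoof):
    """Equal error rate: smallest max(FAR, FRR) over candidate thresholds."""
    thresholds = np.sort(np.concatenate([genuine, spoof]))
    best = 1.0
    for t in thresholds:
        far = np.mean(spoof >= t)      # spoofed trials accepted
        frr = np.mean(genuine < t)     # genuine trials rejected
        best = min(best, max(far, frr))
    return best

genuine = np.array([0.9, 0.8, 0.7, 0.6])   # higher score = more "genuine"
spoof = np.array([0.4, 0.3, 0.65, 0.1])
print(eer(genuine, spoof))                 # -> 0.25 (one error on each side)
```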
DOI: 10.1109/ICASSP40776.2020.9054322 | Pages: 6609-6613 | Published: 2020-05-01
Citations: 6
A Differential Approach for Rain Field Tomographic Reconstruction Using Microwave Signals from Leo Satellites
Xi Shen, D. Huang, C. Vincent, Wenxiao Wang, R. Togneri
A differential approach is proposed for tomographic rain field reconstruction using the estimated signal-to-noise ratio of microwave signals from low earth orbit satellites at ground receivers; the unknown baseline values are eliminated before least squares is used to reconstruct the attenuation field. Simulations are performed both when the baseline is modelled by an autoregressive process and when it is assumed fixed. Comparing the reconstruction results of the differential and non-differential approaches shows that the differential approach performs better in both scenarios. For high correlation coefficients and low model noise in the autoregressive process, the differential approach surpasses the non-differential approach significantly.
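The core idea, eliminating the unknown per-link baseline by differencing before a least-squares inversion, can be sketched on a toy noiseless system (all dimensions and names here are illustrative, not the paper's setup):

```python
import numpy as np

# Each link measures: y = baseline + A @ field (+ noise).  The per-link baseline
# is unknown; subtracting a dry-period reference measurement removes it, and
# least squares then recovers the change in the attenuation field.
rng = np.random.default_rng(2)
n_links, n_pix = 30, 10
A = rng.random((n_links, n_pix))          # path weight of each link over each pixel
baseline = rng.normal(5.0, 1.0, n_links)  # unknown fixed baseline per link
field_dry = np.zeros(n_pix)               # dry reference: no rain attenuation
field_rain = rng.random(n_pix)            # rain event to reconstruct

y_dry = baseline + A @ field_dry
y_rain = baseline + A @ field_rain
diff = y_rain - y_dry                     # unknown baseline cancels

est, *_ = np.linalg.lstsq(A, diff, rcond=None)
print(np.allclose(est, field_rain))       # exact recovery in the noiseless case
```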
DOI: 10.1109/ICASSP40776.2020.9054284 | Pages: 9001-9005 | Published: 2020-05-01
Citations: 3
Stochastic Graph Neural Networks
Zhan Gao, E. Isufi, Alejandro Ribeiro
Graph neural networks (GNNs) model nonlinear representations in graph data, with applications in distributed agent coordination, control, and planning, among others. However, current GNN implementations assume ideal distributed scenarios and ignore link fluctuations that occur due to environmental or human factors. In these situations, the GNN fails to address its distributed task if the topological randomness is not considered accordingly. To overcome this issue, we put forth the stochastic graph neural network (SGNN) model: a GNN where the distributed graph convolutional operator is modified to account for the network changes. Since stochasticity brings in a new paradigm, we develop a novel learning process for the SGNN and introduce the stochastic gradient descent (SGD) algorithm to estimate the parameters. We prove through the SGD that the SGNN learning process converges to a stationary point under mild Lipschitz assumptions. Numerical simulations corroborate the proposed theory and show an improved performance of the SGNN compared with the conventional GNN when operating over random time varying graphs.
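A stochastic graph convolution can be illustrated by redrawing a random link realization at every diffusion step. The sketch below is a generic illustration under an independent-link-drop assumption, not the authors' exact model:

```python
import numpy as np

# Graph filter y = sum_k h_k S^k x, where each application of the shift
# operator S uses a fresh random realization: each edge of the nominal
# graph survives with probability p (random link fluctuations).
rng = np.random.default_rng(3)
n = 6
S_nominal = (rng.random((n, n)) < 0.5).astype(float)
S_nominal = np.triu(S_nominal, 1)
S_nominal = S_nominal + S_nominal.T        # undirected nominal adjacency

def random_shift(S, p, rng):
    """Random realization of S: keep each (undirected) edge with probability p."""
    mask = np.triu(rng.random(S.shape) < p, 1)
    mask = mask + mask.T
    return S * mask

def stochastic_graph_filter(x, h, S, p, rng):
    y = h[0] * x
    z = x
    for hk in h[1:]:
        z = random_shift(S, p, rng) @ z    # fresh random links at every hop
        y = y + hk * z
    return y

x = rng.standard_normal(n)
h = np.array([1.0, 0.5, 0.25])             # filter taps
print(stochastic_graph_filter(x, h, S_nominal, p=0.9, rng=rng))
```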
DOI: 10.1109/ICASSP40776.2020.9054424 | Pages: 9080-9084 | Published: 2020-05-01
Citations: 25
Playing Technique Recognition by Joint Time–Frequency Scattering
Changhong Wang, V. Lostanlen, Emmanouil Benetos, E. Chew
Playing techniques are important expressive elements in music signals. In this paper, we propose a recognition system based on the joint time–frequency scattering transform (jTFST) for pitch evolution-based playing techniques (PETs), a group of playing techniques with monotonic pitch changes over time. The jTFST represents spectro-temporal patterns in the time–frequency domain, capturing discriminative information of PETs. As a case study, we analyse three commonly used PETs of the Chinese bamboo flute: acciacatura, portamento, and glissando, and encode their characteristics using the jTFST. To verify the proposed approach, we create a new dataset, the CBF-petsDB, containing PETs played in isolation as well as in the context of whole pieces performed and annotated by professional players. Feeding the jTFST to a machine learning classifier, we obtain F-measures of 71% for acciacatura, 59% for portamento, and 83% for glissando detection, and provide explanatory visualisations of scattering coefficients for each technique.
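The F-measures quoted above are the harmonic mean of precision and recall. With hypothetical detection counts (chosen here only to reproduce an F-measure of 0.83):

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 83 true positives, 17 false positives, 17 false negatives
print(round(f_measure(83, 17, 17), 2))  # -> 0.83
```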
DOI: 10.1109/ICASSP40776.2020.9053474 | Pages: 881-885 | Published: 2020-05-01
Citations: 12
An Efficient Augmented Lagrangian-Based Method for Linear Equality-Constrained Lasso
Zengde Deng, Man-Chung Yue, A. M. So
Variable selection is one of the most important tasks in statistics and machine learning. To incorporate more prior information about the regression coefficients, various constrained Lasso models have been proposed in the literature. Compared with the classic (unconstrained) Lasso model, the algorithmic aspects of constrained Lasso models are much less explored. In this paper, we demonstrate how the recently developed semismooth Newton-based augmented Lagrangian framework can be extended to solve a linear equality-constrained Lasso model. A key technical challenge that is not present in prior works is the lack of strong convexity in our dual problem, which we overcome by adopting a regularization strategy. We show that under mild assumptions, our proposed method will converge superlinearly. Moreover, extensive numerical experiments on both synthetic and real-world data show that our method can be substantially faster than existing first-order methods while achieving better solution accuracy.
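The paper's solver uses a semismooth Newton method inside an augmented Lagrangian. As a rough first-order sketch of the same outer framework (ISTA inner solves instead of Newton; all parameters illustrative):

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding: proximal operator of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def al_lasso(A, b, C, d, lam=0.1, rho=10.0, outer=50, inner=200):
    """min 0.5*||Ax-b||^2 + lam*||x||_1  s.t.  Cx = d,
    via an augmented Lagrangian with ISTA inner solves (first-order sketch;
    the paper solves the inner subproblem with semismooth Newton instead)."""
    x = np.zeros(A.shape[1])
    y = np.zeros(C.shape[0])                       # multiplier for Cx = d
    L = np.linalg.norm(A, 2) ** 2 + rho * np.linalg.norm(C, 2) ** 2
    for _ in range(outer):
        for _ in range(inner):                     # ISTA on the smooth part
            grad = A.T @ (A @ x - b) + C.T @ y + rho * C.T @ (C @ x - d)
            x = soft(x - grad / L, lam / L)
        y = y + rho * (C @ x - d)                  # dual (multiplier) update
    return x

rng = np.random.default_rng(4)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
C = np.ones((1, 10))                               # constraint: coefficients sum to 1
d = np.array([1.0])
x = al_lasso(A, b, C, d)
print(abs(C @ x - d))                              # constraint residual, near zero
```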
DOI: 10.1109/ICASSP40776.2020.9053722 | Pages: 5760-5764 | Published: 2020-05-01
Citations: 4
Volume Reconstruction for Light Field Microscopy
Herman Verinaz-Jadan, P. Song, Carmel L. Howe, Amanda J. Foust, P. Dragotti
Light Field Microscopy (LFM) is a 3D imaging technique that captures volumetric information in a single snapshot. It is appealing in microscopy because of its simple implementation and because it is much faster than methods that involve scanning. However, volume reconstruction for LFM suffers from low lateral resolution, high computational cost, and reconstruction artifacts near the native object plane. In this work, we make two contributions. First, we propose a simplification of the forward model based on a novel discretization approach that allows us to accelerate the computation without drastically increasing memory consumption. Second, we show experimentally that, by including regularization priors and an appropriate initialization strategy, it is possible to remove the artifacts near the native object plane; the algorithm we use for this is ADMM. Finally, the combination of the two techniques yields a method that outperforms classic volume reconstruction approaches (variants of Richardson-Lucy) in terms of average computational time and image quality (PSNR).
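Richardson-Lucy, the classic baseline family mentioned above, is a multiplicative fixed-point iteration for nonnegative deconvolution. A generic toy version, with a small nonnegative matrix standing in for the LFM forward model (the real forward model is far larger and structured):

```python
import numpy as np

def richardson_lucy(b, H, iters=200):
    """Classic Richardson-Lucy iteration for b = H @ x with nonnegative x.
    H is a generic nonnegative measurement matrix (toy stand-in for the
    light-field forward model)."""
    x = np.ones(H.shape[1])                 # positive initialization
    Ht1 = H.T @ np.ones(H.shape[0])         # column sums (normalizer)
    for _ in range(iters):
        ratio = b / (H @ x + 1e-12)         # measured / predicted
        x = x * (H.T @ ratio) / (Ht1 + 1e-12)
    return x

rng = np.random.default_rng(5)
H = rng.random((12, 4))                     # 12 measurements, 4 voxels (toy sizes)
x_true = np.array([0.2, 1.0, 0.0, 0.5])
b = H @ x_true                              # noiseless measurements
x_hat = richardson_lucy(b, H)
print(np.round(x_hat, 2))
```

The multiplicative update keeps the estimate nonnegative by construction, which is why variants of this scheme remain the standard LFM baseline.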
DOI: 10.1109/ICASSP40776.2020.9053433 | Pages: 1459-1463 | Published: 2020-05-01
Citations: 4
Supervised Deep Hashing for Efficient Audio Event Retrieval
Arindam Jati, Dimitra Emmanouilidou
Efficient retrieval of audio events can facilitate real-time implementation of numerous query- and search-based systems. This work investigates the potency of different hashing techniques for efficient audio event retrieval. Multiple state-of-the-art weak audio embeddings are employed for this purpose. The performance of four classical unsupervised hashing algorithms is explored as part of an off-the-shelf analysis. We then propose a partially supervised deep hashing framework that transforms the weak embeddings into a low-dimensional space while optimizing for efficient hash codes. The model uses only a fraction of the available labels and is shown to significantly improve retrieval accuracy on two widely used audio event datasets. The extensive analysis and comparison between supervised and unsupervised hashing methods presented here give insights into the quantizability of audio embeddings. This work provides a first look at efficient audio event retrieval systems and aims to set baselines for future research.
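A common unsupervised baseline for such retrieval is random-hyperplane LSH: binarize embeddings into short codes, then rank database items by Hamming distance. This sketch uses illustrative dimensions; the paper's supervised method learns the projection rather than drawing it at random:

```python
import numpy as np

rng = np.random.default_rng(6)
dim, bits = 128, 16
P = rng.standard_normal((dim, bits))   # random hyperplanes

def hash_codes(X):
    """One bit per hyperplane: which side of each hyperplane the embedding lies on."""
    return (X @ P > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Database indices sorted by Hamming distance to the query code."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")

db = rng.standard_normal((100, dim))   # toy stand-ins for audio-event embeddings
codes = hash_codes(db)
query_code = hash_codes(db[7:8])[0]    # query whose exact match is item 7
ranking = hamming_rank(query_code, codes)
print(ranking[:5])                     # top-ranked items share the query's code
```

Comparing codes instead of raw embeddings is what makes retrieval fast: Hamming distance over 16-bit codes is far cheaper than 128-dimensional float comparisons.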
{"title":"Supervised Deep Hashing for Efficient Audio Event Retrieval","authors":"Arindam Jati, Dimitra Emmanouilidou","doi":"10.1109/ICASSP40776.2020.9053766","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9053766","url":null,"abstract":"Efficient retrieval of audio events can facilitate real-time implementation of numerous query and search-based systems. This work investigates the potency of different hashing techniques for efficient audio event retrieval. Multiple state-of-the-art weak audio embeddings are employed for this purpose. The performance of four classical unsupervised hashing algorithms is explored as part of off-the-shelf analysis. Then, we propose a partially supervised deep hashing framework that transforms the weak embeddings into a low-dimensional space while optimizing for efficient hash codes. The model uses only a fraction of the available labels and is shown here to significantly improve the retrieval accuracy on two widely employed audio event datasets. The extensive analysis and comparison between supervised and unsupervised hashing methods presented here, give insights on the quantizability of audio embeddings. This work provides a first look in efficient audio event retrieval systems and hopes to set baselines for future research.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"8 1","pages":"4497-4501"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76293284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 2
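As a point of reference for the "classical unsupervised hashing algorithms" the abstract compares against, random-projection LSH is one common such baseline. The sketch below (helper names and the toy "embeddings" are illustrative, not the paper's code) turns real-valued embeddings into binary codes and retrieves neighbors by Hamming distance:

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(embeddings, n_bits=16, rng=rng):
    """Random-projection LSH: keep the sign bit of each random projection."""
    d = embeddings.shape[1]
    planes = rng.standard_normal((d, n_bits))          # random hyperplanes
    return (embeddings @ planes > 0).astype(np.uint8)  # n_bits-bit codes

def hamming_retrieve(query_code, db_codes, k=3):
    """Indices of the k database items closest to the query in Hamming distance."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")[:k]

# Toy stand-in for audio embeddings: 100 random 64-dimensional vectors
db = rng.standard_normal((100, 64))
codes = lsh_hash(db)
top = hamming_retrieve(codes[0], codes, k=3)  # query with item 0's own code
```

Retrieval over the compact binary codes needs only XOR/popcount-style operations, which is what makes hashing attractive for the real-time query scenarios the abstract describes; the supervised deep variant replaces the random hyperplanes with a learned projection.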