
Latest publications: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

A Variational Bayesian Approach for Multichannel Through-Wall Radar Imaging with Low-Rank and Sparse Priors
Van Ha Tang, A. Bouzerdoum, S. L. Phung
This paper considers the problem of multichannel through-wall radar (TWR) imaging from a probabilistic Bayesian perspective. Given the observed radar signals, a joint distribution of the observed data and latent variables is formulated by incorporating two important beliefs: low-dimensional structure of wall reflections and joint sparsity among channel images. These priors are modeled through probabilistic distributions whose hyperparameters are treated with a full Bayesian formulation. Furthermore, the paper presents a variational Bayesian inference algorithm that captures wall clutter and provides channel images as full posterior distributions. Experimental results on real data show that the proposed model is very effective at removing wall clutter and enhancing target localization.
{"title":"A Variational Bayesian Approach for Multichannel Through-Wall Radar Imaging with Low-Rank and Sparse Priors","authors":"Van Ha Tang, A. Bouzerdoum, S. L. Phung","doi":"10.1109/ICASSP40776.2020.9054515","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9054515","url":null,"abstract":"This paper considers the problem of multichannel through-wall radar (TWR) imaging from a probabilistic Bayesian perspective. Given the observed radar signals, a joint distribution of the observed data and latent variables is formulated by incorporating two important beliefs: low-dimensional structure of wall reflections and joint sparsity among channel images. These priors are modeled through probabilistic distributions whose hyperparameters are treated with a full Bayesian formulation. Furthermore, the paper presents a variational Bayesian inference algorithm that captures wall clutter and provides channel images as full posterior distributions. Experimental results on real data show that the proposed model is very effective at removing wall clutter and enhancing target localization.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"961 1","pages":"2523-2527"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85623736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Building Firmly Nonexpansive Convolutional Neural Networks
M. Terris, A. Repetti, J. Pesquet, Y. Wiaux
Building nonexpansive Convolutional Neural Networks (CNNs) is a challenging problem that has recently gained a lot of attention from the image processing community. In particular, it appears to be the key to obtaining convergent Plug-and-Play algorithms. This problem, which relies on accurate control of the Lipschitz constant of the convolutional layers, has also been investigated for Generative Adversarial Networks to improve robustness to adversarial perturbations. However, to the best of our knowledge, no efficient method has been developed yet to build nonexpansive CNNs. In this paper, we develop an optimization algorithm that can be incorporated in the training of a network to ensure the nonexpansiveness of its convolutional layers. This is shown to allow us to build firmly nonexpansive CNNs. We apply the proposed approach to train a CNN for an image denoising task and show its effectiveness through simulations.
{"title":"Building Firmly Nonexpansive Convolutional Neural Networks","authors":"M. Terris, A. Repetti, J. Pesquet, Y. Wiaux","doi":"10.1109/ICASSP40776.2020.9054731","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9054731","url":null,"abstract":"Building nonexpansive Convolutional Neural Networks (CNNs) is a challenging problem that has recently gained a lot of attention from the image processing community. In particular, it appears to be the key to obtain convergent Plugand-Play algorithms. This problem, which relies on an accurate control of the the Lipschitz constant of the convolutional layers, has also been investigated for Generative Adversarial Networks to improve robustness to adversarial perturbations. However, to the best of our knowledge, no efficient method has been developed yet to build nonexpansive CNNs. In this paper, we develop an optimization algorithm that can be incorporated in the training of a network to ensure the nonexpansiveness of its convolutional layers. This is shown to allow us to build firmly nonexpansive CNNs. We apply the proposed approach to train a CNN for an image denoising task and show its effectiveness through simulations.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"74 1","pages":"8658-8662"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85992469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 30
Assessing the Scope of Generalized Countermeasures for Anti-Spoofing
Rohan Kumar Das, Jichen Yang, Haizhou Li
Most research on anti-spoofing countermeasures is specific to one type of spoofing attack, with models trained on data of a particular nature, either synthetic or replay. In practice, however, no such leverage exists, as there is no prior knowledge of the kind of spoofing attack. Therefore, there is a need to assess the scope of generalized countermeasures for anti-spoofing. The ASVspoof 2019 challenge covers both synthetic and replay attacks, which makes its database suitable for such a study. In this work, we consider the widely popular constant-Q cepstral coefficient features along with two other promising front-ends that capture long-term signal characteristics, and assess their scope as generalized countermeasures. Additionally, a comprehensive study is made across different editions of the ASVspoof corpora to highlight the need for robust generalized countermeasures in unseen conditions.
{"title":"Assessing the Scope of Generalized Countermeasures for Anti-Spoofing","authors":"Rohan Kumar Das, Jichen Yang, Haizhou Li","doi":"10.1109/ICASSP40776.2020.9053086","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9053086","url":null,"abstract":"Most of the research on anti-spoofing countermeasures are specific to a type of spoofing attacks, where models are trained on data of a particular nature, either synthetic or replay. However, one does not have such leverage as there is no prior knowledge about the kind of spoofing attack in practice. Therefore, there is a requirement to assess the scope of generalized countermeasures for anti-spoofing. The ASVspoof 2019challengecoversboth synthetic as well as replay attacks, which makes the database suitable for such study. In this work, we consider widely popular constant-Q cepstral coefficient features along with two other promising front-ends that capture long-term signal characteristics to assess their scope as generalized countermeasures. Additionally, a comprehensive study is made across different editions of ASVspoof corpora to highlight the need of robust generalized countermeasures in unseen conditions.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"312 1","pages":"6589-6593"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76891659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
An Improved Solution to the Frequency-Invariant Beamforming with Concentric Circular Microphone Arrays
Xudong Zhao, Gongping Huang, Jingdong Chen, J. Benesty
Frequency-invariant beamforming with circular microphone arrays (CMAs) has drawn a significant amount of attention for its steering flexibility and high directivity. However, frequency-invariant beamforming with CMAs often suffers from the so-called null problem, which is caused by the zeros of the Bessel functions; concentric CMAs (CCMAs) are used to deal with this problem. While frequency-invariant beamforming with CCMAs can mitigate the null problem, the beampattern still suffers from distortion due to spatial aliasing at high frequencies. In this paper, we find that the spatial aliasing problem is caused by higher-order circular harmonics. To deal with this problem, we take the aliasing harmonics into account and approximate the beampattern with a higher truncation order of the Jacobi-Anger expansion than required. Then, the beamforming filter is determined by minimizing the errors between the desired directivity pattern and the approximated one. Simulation results show that the developed method can mitigate the distortion of the beampattern caused by spatial aliasing.
{"title":"An Improved Solution to the Frequency-Invariant Beamforming with Concentric Circular Microphone Arrays","authors":"Xudong Zhao, Gongping Huang, Jingdong Chen, J. Benesty","doi":"10.1109/ICASSP40776.2020.9054141","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9054141","url":null,"abstract":"Frequency-invariant beamforming with circular microphone arrays (CMAs) has drawn a significant amount of attention for its steering flexibility and high directivity. However, frequency-invariant beam-forming with CMAs often suffers from the so-called null problem, which is caused by the zeros of the Bessel functions; then, concentric CMAs (CCMAs) are used to deal with this problem. While frequency-invariant beamforming with CCMAs can mitigate the null problem, the beampattern is still suffering from distortion due to s-patial aliasing at high frequencies. In this paper, we find that the spatial aliasing problem is caused by higher-order circular harmonics. To deal with this problem, we take the aliasing harmonics into account and approximate the beampattern with a higher truncation order of the Jacobi-Anger expansion than required. Then, the beam-forming filter is determined by minimizing the errors between the desired directivity pattern and the approximated one. 
Simulation results show that the developed method can mitigate the distortion of the beampattern caused by spatial aliasing.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"5 1","pages":"556-560"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76979391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
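The Jacobi-Anger expansion invoked above decomposes a plane wave into circular harmonics; truncating the series at a higher order N than nominally required is what lets the method account for the aliased higher-order harmonics:

```latex
e^{\jmath x \cos\theta}
  \;=\; \sum_{n=-\infty}^{\infty} \jmath^{\,n} J_n(x)\, e^{\jmath n\theta}
  \;\approx\; \sum_{n=-N}^{N} \jmath^{\,n} J_n(x)\, e^{\jmath n\theta},
```

where $J_n$ denotes the Bessel function of the first kind of order $n$ — the same Bessel functions whose zeros cause the null problem in single-ring CMAs.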
Citations: 4
Teaching Signals and Systems - A First Course in Signal Processing
Nikhar P. Rakhashia, Ankit A. Bhurane, V. Gadre
Signals and systems is a well-known fundamental course in signal processing. How this course is taught can determine whether a student pursues a career in this field. With due consideration to this matter, this paper reflects on the experiences of teaching this course. In addition, the authors share the experiences of creating and conducting a Massive Open Online Course (MOOC) on this subject under edX, and of subsequently following it up with deliberation among some students who took the course through the platform. Further, this paper emphasizes various active learning techniques and modes of evaluation to ensure effective and holistic learning of the course.
{"title":"Teaching Signals and Systems - A First Course in Signal Processing","authors":"Nikhar P. Rakhashia, Ankit A. Bhurane, V. Gadre","doi":"10.1109/ICASSP40776.2020.9054231","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9054231","url":null,"abstract":"Signals and systems is a well known fundamental course in signal processing. How this course is taught to a student can spell the difference between whether s/he pursues a career in this field or not. Giving due consideration to this matter, this paper reflects on the experiences in teaching this course. In addition, the authors share the experiences of creating and conducting a Massive Open Online Course (MOOC) on this subject under edX and subsequently following it up with deliberation among some students who did this course through the platform. Further, this paper emphasizes on various active learning techniques and modes of evaluation to ensure effective and holistic learning of the course.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"247 1","pages":"9224-9228"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76987245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Frequency-Domain BSS Method Based on ℓ1 Norm, Unitary Constraint, and Cayley Transform
S. Emura, H. Sawada, S. Araki, N. Harada
We propose a frequency-domain blind source separation method that uses (a) the ℓ1 norm of orthonormal vectors of estimated source signals as a sparsity measure and (b) the Cayley transform for optimizing the objective function under the unitary constraint in the Riemannian geometry approach. The orthonormal vectors of estimated source signals, obtained by sphering the observed mixed signals and imposing the unitary constraint on the separation filters, enable us to use the ℓ1 norm properly as a sparsity measure. The Cayley transform enables us to handle the geometrical aspects of the unitary constraint efficiently. In a simulation of a two-channel case, the proposed method achieved a 20-dB improvement in the source-to-interference ratio in a room with a reverberation time of T60 = 300 ms.
{"title":"A Frequency-Domain BSS Method Based on ℓ1 Norm, Unitary Constraint, and Cayley Transform","authors":"S. Emura, H. Sawada, S. Araki, N. Harada","doi":"10.1109/ICASSP40776.2020.9053757","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9053757","url":null,"abstract":"We propose a frequency-domain blind source separation method that uses (a) the ℓ1 norm of orthonormal vectors of estimated source signals as a sparsity measure and (b) Cayley transform for optimizing the objective function under the unitary constraint in the Riemannian geometry approach. The orthonormal vectors of estimated source signals, obtained by the sphering of observed mixed signals and the unitary constraint on the separation filters, enables us to use the ℓ1 norm properly as a sparsity measure. The Cayley transform enables us to handle the geometrical aspects of the unitary constraint efficiently. According to the simulation of a two-channel case, the proposed method achieved a 20-dB improvement in the source-to-interference ratio in a room with a reverberation time of T60 = 300ms.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"48 1","pages":"111-115"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77148137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Exploring Entity-Level Spatial Relationships for Image-Text Matching
Yaxian Xia, Lun Huang, Wenmin Wang, Xiao-Yong Wei, Jie Chen
Exploring the entity-level (i.e., objects in an image, words in a text) spatial relationship contributes to understanding multimedia content precisely. Ignoring spatial information, as previous works do, can lead to misunderstandings of image contents. For instance, the sentences 'Boats are on the water' and 'Boats are under the water' describe the same objects but correspond to different scenes. To this end, we utilize the relative position of objects to capture entity-level spatial relationships for image-text matching. Specifically, we fuse semantic and spatial relationships of image objects in a visual intra-modal relation module. The module performs promisingly in understanding image contents and improving object representation learning. It contributes to capturing the entity-level latent correspondence of image-text pairs. Then the query (text) plays the role of textual context to refine the interpretable alignments of image-text pairs in the inter-modal relation module. Our proposed method achieves state-of-the-art results on the MSCOCO and Flickr30K datasets.
{"title":"Exploring Entity-Level Spatial Relationships for Image-Text Matching","authors":"Yaxian Xia, Lun Huang, Wenmin Wang, Xiao-Yong Wei, Jie Chen","doi":"10.1109/ICASSP40776.2020.9054758","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9054758","url":null,"abstract":"Exploring the entity-level (i.e., objects in an image, words in a text) spatial relationship contributes to understanding multimedia content precisely. The ignorance of spatial information in previous works probably leads to misunderstandings of image contents. For instance, sentences ‘Boats are on the water’ and ‘Boats are under the water’ describe the same objects, but correspond to different sceneries. To this end, we utilize the relative position of objects to capture entity-level spatial relationships for image-text matching. Specifically, we fuse semantic and spatial relationships of image objects in a visual intra-modal relation module. The module performs promisingly to understand image contents and improve object representation learning. It contributes to capturing entity-level latent correspondence of image-text pairs. Then the query (text) plays a role of textual context to refine the interpretable alignments of image-text pairs in the inter-modal relation module. 
Our proposed method achieves state-of-the-art results on MSCOCO and Flickr30K datasets.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"429 1","pages":"4452-4456"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77238665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
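One common way to encode the entity-level relative position of two detected objects is through normalized center offsets and log size ratios of their bounding boxes; the sign of the vertical offset, for example, is what separates 'on the water' from 'under the water'. The sketch below is a generic encoding of this kind, an assumption for illustration rather than the paper's exact feature:

```python
import numpy as np

def relative_position_features(boxes):
    # boxes: (N, 4) array of (x, y, w, h) bounding boxes.
    # Returns (N, N, 4) pairwise geometry features:
    # normalized center offsets and log size ratios.
    x, y, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    cx, cy = x + w / 2, y + h / 2
    dx = (cx[None, :] - cx[:, None]) / w[:, None]   # horizontal offset
    dy = (cy[None, :] - cy[:, None]) / h[:, None]   # vertical offset
    dw = np.log(w[None, :] / w[:, None])            # width ratio
    dh = np.log(h[None, :] / h[:, None])            # height ratio
    return np.stack([dx, dy, dw, dh], axis=-1)
```

Features like these can then be fused with the objects' semantic features inside a relation module.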
Citations: 3
Deep Metric Learning Based On Center-Ranked Loss for Gait Recognition
Jingran Su, Yang Zhao, Xuelong Li
Gait information has gradually attracted attention due to its uniqueness. Methods based on deep metric learning have been successfully utilized in gait recognition tasks. However, most previous studies use losses that consider only a small number of samples in the mini-batch, such as the triplet loss and the quadruplet loss, which is not conducive to the convergence of the model. Therefore, in this paper, a novel loss named Center-ranked is proposed to integrate the information of all positive and negative samples. We also propose a simple model for gait recognition tasks to verify the validity of the loss. Extensive experiments on two challenging datasets, CASIA-B and OU-MVLP, demonstrate the superiority and effectiveness of our proposed Center-ranked loss and model.
{"title":"Deep Metric Learning Based On Center-Ranked Loss for Gait Recognition","authors":"Jingran Su, Yang Zhao, Xuelong Li","doi":"10.1109/ICASSP40776.2020.9054645","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9054645","url":null,"abstract":"Gait information has gradually attracted people’s attention duing to its uniqueness. Methods based on deep metric learning are successfully utlized in gait recognition tasks. However, most of the previous studies use losses which only consider a small number of samples in the mini-batch, such as Triplet loss and Quadruplet Loss, which is not conducive to the convergence of the model. Therefore, in this paper, a novel loss named Center-ranked is proposed to integrate all positive and negative samples information. We also propose a simple model for gait recognition tasks to verify the validity of the loss. Extensive experiments on two challenging datasets CASIA-B and OU-MVLP demonstrate the superiority and effectiveness of our proposed Center-ranked loss and model.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"153 1","pages":"4077-4081"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80996845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
A Novel Two-Pathway Encoder-Decoder Network for 3D Face Reconstruction
Xianfeng Li, Zichun Weng, Juntao Liang, Lei Cei, Youjun Xiang, Yuli Fu
The 3D Morphable Model (3DMM) is a statistical tool widely employed in reconstructing 3D face shape. Existing methods aim to predict 3DMM shape parameters with a single encoder but suffer from unclear separation of different attributes. To address this problem, a Two-Pathway Encoder-Decoder Network (2PEDN) is proposed to regress the identity and expression components via global and local pathways. Specifically, each 2D face image is cropped into a global face and local details as the inputs to the corresponding pathways. 2PEDN is trained to predict 3D face shape components with two sets of loss functions designed to supervise the 3D face reconstruction error and the face identification error. To balance retaining abundant facial details against computer storage space, a magnitudes converter is devised. Experiments demonstrate that the proposed method outperforms several 3D face reconstruction methods.
{"title":"A Novel Two-Pathway Encoder-Decoder Network for 3D Face Reconstruction","authors":"Xianfeng Li, Zichun Weng, Juntao Liang, Lei Cei, Youjun Xiang, Yuli Fu","doi":"10.1109/ICASSP40776.2020.9053699","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9053699","url":null,"abstract":"3D Morphable Model (3DMM) is a statistical tool widely employed in reconstructing 3D face shape. Existing methods are aimed at predicting 3DMM shape parameters with a single encoder but suffer from unclear distinction of different attributes. To address this problem, Two-Pathway Encoder-Decoder Network (2PEDN) is proposed to regress the identity and expression components via global and local pathways. Specifically, each 2D face image is cropped into global face and local details as the inputs for the corresponding pathways. 2PEDN is trained to predict 3D face shape components with two sets of loss functions designed to supervise 3D face reconstruction error and face identification error. To reduce the conflict between abundant facial details and saving computer storage space, a magnitudes converter is devised. Experiments demonstrate that the proposed method outperforms several 3D face recontruction methods.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"186 1","pages":"3682-3686"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81077269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Fast and Stable Blind Source Separation with Rank-1 Updates
Robin Scheibler, Nobutaka Ono
We propose a new algorithm for the blind source separation of acoustic sources. This algorithm is an alternative to the popular auxiliary-function-based independent vector analysis using iterative projection (AuxIVA-IP). It optimizes the same cost function, but instead of alternating updates of the rows of the demixing matrix, we propose a sequence of rank-1 updates. Remarkably, and unlike the previous method, the resulting updates do not require matrix inversion. Moreover, their computational complexity is quadratic in the number of microphones, rather than cubic as in AuxIVA-IP. In addition, we show that the new method can be derived as alternating updates of the steering vectors of the sources. Accordingly, we name the method iterative source steering (AuxIVA-ISS). Finally, we confirm in simulated experiments that the proposed algorithm separates sources just as well as AuxIVA-IP, at a lower computational cost.
{"title":"Fast and Stable Blind Source Separation with Rank-1 Updates","authors":"Robin Scheibler, Nobutaka Ono","doi":"10.1109/ICASSP40776.2020.9053556","DOIUrl":"https://doi.org/10.1109/ICASSP40776.2020.9053556","url":null,"abstract":"We propose a new algorithm for the blind source separation of acoustic sources. This algorithm is an alternative to the popular auxiliary function based independent vector analysis using iterative projection (AuxIVA-IP). It optimizes the same cost function, but instead of alternate updates of the rows of the demixing matrix, we propose a sequence of rank-1 updates. Remarkably, and unlike the previous method, the resulting updates do not require matrix inversion. Moreover, their computational complexity is quadratic in the number of microphones, rather than cubic in AuxIVA-IP. In addition, we show that the new method can be derived as alternate updates of the steering vectors of sources. Accordingly, we name the method iterative source steering (AuxIVA-ISS). Finally, we confirm in simulated experiments that the proposed algorithm separates sources just as well as AuxIVA-IP, at a lower computational cost.","PeriodicalId":13127,"journal":{"name":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"51 1","pages":"236-240"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81126826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 34