
Latest Publications: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV)

Dense 3D Point Cloud Reconstruction Using a Deep Pyramid Network
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00117
Priyanka Mandikal, R. Venkatesh Babu
Reconstructing a high-resolution 3D model of an object is a challenging task in computer vision. Designing scalable and lightweight architectures is crucial when addressing this problem. Existing point-cloud-based reconstruction approaches directly predict the entire point cloud in a single stage. Although this technique can handle low-resolution point clouds, it is not a viable solution for generating dense, high-resolution outputs. In this work, we introduce DensePCR, a deep pyramidal network for point cloud reconstruction that hierarchically predicts point clouds of increasing resolution. Towards this end, we propose an architecture that first predicts a low-resolution point cloud, and then hierarchically increases the resolution by aggregating local and global point features to deform a grid. Our method generates point clouds that are accurate, uniform and dense. Through extensive quantitative and qualitative evaluation on synthetic and real datasets, we demonstrate that DensePCR outperforms existing state-of-the-art point cloud reconstruction methods, while also providing a lightweight and scalable architecture for predicting high-resolution outputs.
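As a rough illustration of the grid-deformation idea described above, the following PyTorch sketch implements one pyramid super-resolution stage that grows an input cloud by a factor of four. The layer widths, the 2x2 local grid, and the max-pooled global feature are our own simplifying assumptions, not the authors' exact DensePCR configuration.

```python
import torch
import torch.nn as nn

class DenseUpsampleStage(nn.Module):
    """One pyramid stage: grows a (B, N, 3) cloud to (B, N*up, 3) by
    aggregating local and global point features to deform a small grid
    around every parent point. All sizes are illustrative, not DensePCR's."""
    def __init__(self, up=4, feat=64):
        super().__init__()
        self.up = up                            # assumed to be a perfect square
        self.local = nn.Sequential(nn.Linear(3, feat), nn.ReLU(),
                                   nn.Linear(feat, feat))
        # per-child input: local feature + global feature + 2D grid offset
        self.deform = nn.Sequential(nn.Linear(feat * 2 + 2, feat), nn.ReLU(),
                                    nn.Linear(feat, 3))

    def forward(self, pts):                     # pts: (B, N, 3)
        B, N, _ = pts.shape
        f_loc = self.local(pts)                 # (B, N, F) local features
        f_glob = f_loc.max(dim=1, keepdim=True).values.expand(-1, N, -1)
        side = int(self.up ** 0.5)              # 2x2 grid when up = 4
        axis = torch.linspace(-0.05, 0.05, side, device=pts.device)
        grid = torch.stack(torch.meshgrid(axis, axis, indexing="ij"),
                           dim=-1).reshape(-1, 2)           # (up, 2)
        feats = torch.cat([f_loc, f_glob], -1)              # (B, N, 2F)
        feats = feats.unsqueeze(2).expand(-1, -1, self.up, -1)
        grid = grid.view(1, 1, self.up, 2).expand(B, N, -1, -1)
        offs = self.deform(torch.cat([feats, grid], -1))    # (B, N, up, 3)
        return (pts.unsqueeze(2) + offs).reshape(B, N * self.up, 3)

coarse = torch.rand(1, 1024, 3)
print(DenseUpsampleStage()(coarse).shape)       # torch.Size([1, 4096, 3])
```

Stacking two such stages after a base decoder would take a 1K-point prediction to 16K points, matching the coarse-to-dense hierarchy the abstract describes.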
Citations: 88
DAFE-FD: Density Aware Feature Enrichment for Face Detection
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00236
Vishwanath A. Sindagi, Vishal M. Patel
Recent research on face detection, focused primarily on improving the accuracy of detecting smaller faces, attempts to develop new anchor design strategies that increase the overlap between anchor boxes and ground-truth faces of smaller sizes. In this work, we approach the problem of small face detection with the motivation of enriching the feature maps using a density map estimation module. This module, inspired by recent crowd counting/density estimation techniques, estimates the per-pixel density of people/faces present in the image. The output of this module is used to accentuate the feature maps from the backbone network via a feature enrichment module before they are used for detecting smaller faces. The proposed approach can complement recent anchor-design-based methods to further improve their results. Experiments conducted on different datasets such as WIDER, FDDB and Pascal-Faces demonstrate the effectiveness of the proposed approach.
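A minimal sketch of how a density-estimation branch could accentuate backbone features, in the spirit of the module described above; the residual modulation, the 1x1-convolution fusion, and all channel counts are assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class DensityFeatureEnrichment(nn.Module):
    """Estimate a per-pixel face-density map from backbone features, then
    use it to enrich those features before detection. Layer sizes and the
    residual fusion are illustrative guesses, not the DAFE-FD design."""
    def __init__(self, channels=256):
        super().__init__()
        self.density_head = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 1), nn.ReLU())           # non-negative density
        self.enrich = nn.Conv2d(channels + 1, channels, 1)

    def forward(self, feat):                          # feat: (B, C, H, W)
        density = self.density_head(feat)             # (B, 1, H, W)
        enriched = self.enrich(torch.cat([feat, density], dim=1))
        return feat + enriched, density               # residual enrichment

x = torch.randn(2, 256, 40, 40)
y, d = DensityFeatureEnrichment()(x)
print(y.shape, d.shape)   # (2, 256, 40, 40) (2, 1, 40, 40)
```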
Citations: 15
Latent Fingerprint Enhancement Using Generative Adversarial Networks
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00100
Indu Joshi, A. Anand, Mayank Vatsa, Richa Singh, Sumantra Dutta Roy, P. Kalra
Latent fingerprint recognition is very useful in law enforcement and forensics applications. However, automated matching of latent fingerprints against a gallery of live-scan images is very challenging due to several compounding factors such as noisy background, poor ridge structure, and overlapping unstructured noise. In order to match latent fingerprints efficiently, an effective enhancement module is a necessity, as it facilitates correct minutiae extraction. In this research, we propose a Generative Adversarial Network based latent fingerprint enhancement algorithm to enhance the poor-quality ridges and predict the ridge information. Experiments on two publicly available datasets, IIITD-MOLF and IIITD-MSLFD, show that the proposed enhancement algorithm improves fingerprint quality while preserving the ridge structure. It helps standard feature extraction and matching algorithms to boost latent fingerprint matching performance.
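For concreteness, here is a minimal sketch of one adversarial training step for ridge enhancement, assuming paired degraded/clean patches for illustration; the tiny generator and discriminator, the L1 weight of 100, and the optimizer settings are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Stand-in generator (enhances ridges) and patch discriminator; the real
# networks are far deeper -- these only demonstrate the training pattern.
G = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())
D = nn.Sequential(nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                  nn.Conv2d(32, 1, 4, stride=2, padding=1))
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

latent = torch.rand(4, 1, 64, 64)     # degraded latent prints (stand-in data)
clean = torch.rand(4, 1, 64, 64)      # corresponding good-quality ridge maps

# Discriminator step: real ridge maps vs. enhanced fakes.
fake = G(latent).detach()
loss_d = bce(D(clean), torch.ones_like(D(clean))) + \
         bce(D(fake), torch.zeros_like(D(fake)))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool D while staying close to the true ridge structure.
fake = G(latent)
loss_g = bce(D(fake), torch.ones_like(D(fake))) + 100 * l1(fake, clean)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```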
Citations: 30
Recovering Faces From Portraits with Auxiliary Facial Attributes
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00049
Fatemeh Shiri, Xin Yu, F. Porikli, R. Hartley, Piotr Koniusz
Recovering a photorealistic face from an artistic portrait is a challenging task since crucial facial details are often distorted or completely lost in artistic compositions. To handle this loss, we propose Attribute-guided Face Recovery from Portraits (AFRP), which utilizes a Face Recovery Network (FRN) and a Discriminative Network (DN). The FRN consists of an autoencoder with residual block-embedded skip-connections and incorporates facial attribute vectors into the feature maps of input portraits at the bottleneck of the autoencoder. The DN has multiple convolutional and fully-connected layers, and its role is to enforce that the FRN generates authentic face images with the facial attributes dictated by the input attribute vectors. For the preservation of identity, we constrain the recovered and ground-truth faces to share similar visual features. Specifically, the DN determines whether the recovered image looks like a real face and checks whether the facial attributes extracted from the recovered image are consistent with the given attributes. Our method can recover photorealistic identity-preserving faces with desired attributes from unseen stylized portraits, artistic paintings, and hand-drawn sketches. On large-scale synthesized and sketch datasets, we demonstrate that our face recovery method achieves state-of-the-art results.
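To make the bottleneck conditioning concrete, here is a toy autoencoder that tiles a facial-attribute vector across the bottleneck feature map before decoding, as the FRN description suggests; the depth, channel counts, and the omitted residual skip-connections are our assumptions.

```python
import torch
import torch.nn as nn

class AttributeBottleneckAE(nn.Module):
    """Sketch of the FRN idea: an autoencoder whose bottleneck features are
    concatenated with a facial-attribute vector before decoding. Sizes and
    the missing skip-connections are simplifications, not AFRP's design."""
    def __init__(self, n_attrs=10):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64 + n_attrs, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh())

    def forward(self, img, attrs):             # img: (B,3,H,W), attrs: (B,A)
        z = self.enc(img)                      # bottleneck: (B,64,H/4,W/4)
        a = attrs[:, :, None, None].expand(-1, -1, z.size(2), z.size(3))
        return self.dec(torch.cat([z, a], 1))  # attribute-conditioned decode

out = AttributeBottleneckAE()(torch.randn(2, 3, 64, 64), torch.rand(2, 10))
print(out.shape)  # torch.Size([2, 3, 64, 64])
```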
Citations: 6
GAN-Based Pose-Aware Regulation for Video-Based Person Re-Identification
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00130
Alessandro Borgia, Yang Hua, Elyor Kodirov, N. Robertson
Video-based person re-identification deals with the inherent difficulty of matching sequences of different lengths, with unregulated and incomplete target pose/viewpoint structure. Common approaches operate either by reducing the problem to the still-image case, incurring a significant information loss, or by exploiting inter-sequence temporal dependencies, as in Siamese Recurrent Neural Networks or in gait analysis. However, in all cases the inter-sequence pose/viewpoint misalignment is left unaddressed, and the existing spatial approaches are mostly limited to the still-image context. To this end, we propose a novel approach that exploits the rich video information more effectively by accounting for the role that the changing pose/viewpoint factor plays in the sequence matching process. In particular, our approach consists of two components. The first attempts to complement the original pose-incomplete information carried by the sequences with synthetic GAN-generated images, and fuses their feature vectors into a more discriminative viewpoint-insensitive embedding, namely Weighted Fusion (WF). The other performs an explicit pose-based alignment of sequence pairs to promote coherent feature matching, namely Weighted-Pose Regulation (WPR). Extensive experiments on two large video-based benchmark datasets show that our approach considerably outperforms existing methods.
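A minimal sketch of the Weighted Fusion (WF) step: per-frame descriptors of real and GAN-synthesised frames are pooled into a single sequence embedding with learned weights. The softmax-over-frames weighting used here is an assumption, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fuse per-frame descriptors of real and synthetic frames into one
    viewpoint-insensitive sequence embedding via learned scalar weights."""
    def __init__(self, dim=256):
        super().__init__()
        self.score = nn.Linear(dim, 1)          # one relevance score per frame

    def forward(self, real_feats, synth_feats):          # both: (B, T, D)
        feats = torch.cat([real_feats, synth_feats], dim=1)   # (B, 2T, D)
        w = torch.softmax(self.score(feats), dim=1)           # (B, 2T, 1)
        return (w * feats).sum(dim=1)                         # (B, D)

emb = WeightedFusion()(torch.randn(4, 8, 256), torch.randn(4, 8, 256))
print(emb.shape)   # torch.Size([4, 256])
```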
Citations: 7
FgGAN: A Cascaded Unpaired Learning for Background Estimation and Foreground Segmentation
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00193
Prashant W. Patil, S. Murala
Moving object segmentation (MOS) in videos with bad weather, irregular object motion, camera jitter, shadows and dynamic backgrounds is still an open problem for computer vision applications. To address these issues, in this paper we propose an approach named Foreground Generative Adversarial Network (FgGAN), built on the recent concepts of generative adversarial networks (GANs) and unpaired training, for background estimation and foreground segmentation. To the best of our knowledge, this is the first paper to apply GAN-based unpaired learning to MOS. Initially, a video-wise background is estimated using a GAN-based unpaired learning network (network-I). Then, to extract motion information related to the foreground, motion saliency is estimated from the estimated background and the current video frame. Further, the estimated motion saliency is given as input to a GAN-based unpaired learning network (network-II) for foreground segmentation. To examine the effectiveness of the proposed FgGAN (cascaded networks I and II), challenging video categories like dynamic background, bad weather, intermittent object motion and shadow are collected from the ChangeDetection.net-2014 [26] database. Segmentation accuracy is evaluated qualitatively and quantitatively in terms of F-measure and percentage of wrong classification (PWC) and compared with existing state-of-the-art methods. The experimental results show that the proposed FgGAN achieves a significant improvement in F-measure and PWC over existing state-of-the-art methods for MOS.
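The cascade described above can be sketched as the following data flow, with tiny stand-in networks in place of the two unpaired GAN generators; the absolute-difference motion saliency is our simplification of the paper's saliency step.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

# Placeholders for the two unpaired-GAN generators; the real FgGAN networks
# are much deeper, so treat these as stand-ins for the data flow only.
net1_background = nn.Sequential(conv_block(3, 16), nn.Conv2d(16, 3, 1))
net2_foreground = nn.Sequential(conv_block(1, 16), nn.Conv2d(16, 1, 1),
                                nn.Sigmoid())

frame = torch.rand(1, 3, 120, 160)
background = net1_background(frame)                 # network-I estimate
# Motion saliency from the current frame and the estimated background;
# the plain absolute difference here is an assumed simplification.
saliency = (frame - background).abs().mean(dim=1, keepdim=True)
mask = net2_foreground(saliency)                    # network-II segmentation
print(mask.shape)  # torch.Size([1, 1, 120, 160])
```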
Citations: 30
Conditional Generative Adversarial Refinement Networks for Unbalanced Medical Image Semantic Segmentation
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00200
Mina Rezaei, Haojin Yang, Konstantin Harmuth, C. Meinel
We propose a new generative adversarial architecture to mitigate the imbalanced-data problem in medical image semantic segmentation, where the majority of pixels belong to a healthy region and few belong to a lesion or non-healthy region. A model trained with imbalanced data tends to bias towards healthy data, which is not desired in clinical applications, and the outputs predicted by such networks have high precision but low sensitivity. We propose a new conditional generative refinement network with three components: a generative, a discriminative, and a refinement network, which together mitigate the imbalanced-data problem through ensemble learning. The generative network learns to segment at the pixel level by getting feedback from the discriminative network according to the true-positive and true-negative maps. The refinement network, in turn, learns to predict the false-positive and false-negative masks produced by the generative network, which have significant value, especially in medical applications. The final semantic segmentation masks are then composed from the outputs of the three networks. The proposed architecture achieves state-of-the-art results on LiTS-2017 for simultaneous liver and lesion segmentation, and on MDA231 for microscopic cell segmentation. We also achieve competitive results on BraTS-2017 for brain tumor segmentation.
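A toy sketch of the three-network ensemble: an initial mask is corrected by adding predicted false negatives and subtracting predicted false positives. This composition rule and the miniature network heads are our reading of the description, not formulas quoted from the paper.

```python
import torch
import torch.nn as nn

def head(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, cout, 1), nn.Sigmoid())

# Stand-in generative and refinement networks (the adversarial training
# against the discriminative network is omitted for brevity).
segment = head(1, 1)          # generative network: initial lesion mask
refine_fn = head(2, 1)        # refinement: probability a pixel was missed
refine_fp = head(2, 1)        # refinement: probability a pixel is spurious

scan = torch.rand(2, 1, 128, 128)            # e.g. a CT/MR slice
m = segment(scan)
inp = torch.cat([scan, m], dim=1)            # condition refinement on both
final = (m + refine_fn(inp) - refine_fp(inp)).clamp(0, 1)
print(final.shape)  # torch.Size([2, 1, 128, 128])
```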
Citations: 20
Starts Better and Ends Better: A Target Adaptive Image Signature Tracker
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00024
Xingchao Liu, Ce Li, Hongren Wang, Xiantong Zhen, Baochang Zhang, Qixiang Ye
Correlation filter (CF) trackers have achieved outstanding performance in visual object tracking tasks, in which the cosine mask plays an essential role in alleviating the boundary effects caused by the circular assumption. However, the cosine mask imposes a larger weight on the center position, which greatly affects CF trackers: their performance drops significantly if a bad starting point happens to occur. To address this issue, we propose a target adaptive image signature (TaiS) model to refine the starting point in each frame for CF trackers. Specifically, we incorporate the target prior into the image signature to build a target-specific saliency map, and iteratively refine the starting point with a closed-form solution during the tracking process. As a result, our TaiS is able to find a better starting point close to the center of the target; more importantly, it is independent of any specific CF tracker and can efficiently improve its performance. Experiments on two benchmark datasets, OTB100 and UAV123, demonstrate that our TaiS consistently achieves high performance and advances the state of the art in visual tracking. The source code of our approach will be made publicly available.
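For reference, the image-signature saliency that TaiS builds on can be computed in a few lines (the sign of the image's DCT, inverse-transformed, squared and smoothed); the multiplicative fusion with a Gaussian target prior below is our assumption about how the target prior might be incorporated.

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.ndimage import gaussian_filter

def image_signature_saliency(gray, prior=None, sigma=3.0):
    """Image-signature saliency: sign of the DCT, inverted and squared,
    then smoothed; optionally weighted by a target prior map."""
    sal = idctn(np.sign(dctn(gray, norm='ortho')), norm='ortho') ** 2
    sal = gaussian_filter(sal, sigma)
    if prior is not None:          # e.g. a Gaussian centred on the last
        sal = sal * prior          # known target position (an assumption)
    return sal / (sal.max() + 1e-12)

h, w = 64, 64
gray = np.random.rand(h, w)                     # stand-in grayscale frame
yy, xx = np.mgrid[0:h, 0:w]
prior = np.exp(-(((yy - h / 2) ** 2 + (xx - w / 2) ** 2) / (2 * 15.0 ** 2)))
sal = image_signature_saliency(gray, prior)
start = np.unravel_index(sal.argmax(), sal.shape)   # refined starting point
print(start)
```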
Citations: 0
Shadow Patching: Guided Image Completion for Shadow Removal
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00217
Ryan S. Hintze, B. Morse
Removing unwanted shadows is a common need in photo editing software. Previous methods handle some shadows well but perform poorly under severe degradation (darker shadowing) because they rely on directly restoring the degraded data in the shadowed region. Image-completion algorithms can completely replace severely degraded shadowed regions and perform well with smaller-scale textures, but often fail to reproduce larger-scale macrostructure that may still be visible in the shadowed region. This paper provides a general framework that leverages the degraded (in this case shadowed) data in a region to guide image completion: it extends the objective function commonly used in current state-of-the-art energy-minimization methods for image completion to include not only visual realism but also consistency with the original degraded content. This approach achieves realistic-looking shadow removal even in cases of severe degradation where precise recovery of the unshadowed content may not be possible. Although not demonstrated here, the generality of the approach potentially allows it to be extended to other types of localized degradation.
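The extended objective can be sketched as a per-patch energy with two terms: a realism term against the known pixels around the hole, and a guidance term against the (brightness-corrected) shadowed content being replaced. The sum-of-squared-differences form and the weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def patch_energy(candidate, boundary_target, shadowed_guide, lam=0.5):
    """Score a candidate source patch by (1) agreement with already-known
    pixels near the hole (visual realism) and (2) consistency with the
    degraded shadowed content it will replace (guidance)."""
    realism = np.sum((candidate - boundary_target) ** 2)
    guidance = np.sum((candidate - shadowed_guide) ** 2)
    return realism + lam * guidance

# Pick the best of a set of candidate patches (toy 7x7 grayscale example).
cands = [np.random.rand(7, 7) for _ in range(100)]
target = np.random.rand(7, 7)          # known pixels around the hole
guide = np.random.rand(7, 7)           # brightness-corrected shadow content
best = min(cands, key=lambda c: patch_energy(c, target, guide))
print(patch_energy(best, target, guide))
```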
Citations: 4
Observing Pianist Accuracy and Form with Computer Vision
Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00165
Jangwon Lee, Bardia Doosti, Yupeng Gu, David Cartledge, David J. Crandall, C. Raphael
We present a first step towards developing an interactive piano tutoring system that can observe a student playing the piano and give feedback about hand movements and musical accuracy. In particular, we have two primary aims: 1) to determine which notes on a piano are being played at any moment in time, and 2) to identify which finger is pressing each note. We introduce a novel two-stream convolutional neural network that takes video and audio inputs together for detecting pressed notes and finger presses. We formulate our two problems in terms of multi-task learning and extend a state-of-the-art object detection model to incorporate both audio and visual features. In addition, we introduce a novel finger identification solution based on pressed-note information. We experimentally confirm that our approach is able to detect pressed piano keys and the pianist's fingers with high accuracy.
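A minimal sketch of a two-stream, two-head network of the kind described: one stream over video frames, one over audio spectrograms, with multi-label heads for the 88 piano keys and 10 fingers. All layer sizes and the late-fusion design are illustrative assumptions, not the authors' model.

```python
import torch
import torch.nn as nn

class TwoStreamPiano(nn.Module):
    """Two-stream multi-task network: visual stream over a keyboard crop,
    audio stream over a spectrogram, fused for key and finger prediction."""
    def __init__(self):
        super().__init__()
        self.video = nn.Sequential(nn.Conv2d(3, 16, 5, 2, 2), nn.ReLU(),
                                   nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.audio = nn.Sequential(nn.Conv2d(1, 16, 5, 2, 2), nn.ReLU(),
                                   nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.note_head = nn.Linear(32, 88)     # multi-label: pressed keys
        self.finger_head = nn.Linear(32, 10)   # multi-label: active fingers

    def forward(self, frames, spec):
        f = torch.cat([self.video(frames), self.audio(spec)], dim=1)
        return (torch.sigmoid(self.note_head(f)),
                torch.sigmoid(self.finger_head(f)))

notes, fingers = TwoStreamPiano()(torch.rand(2, 3, 128, 128),
                                  torch.rand(2, 1, 96, 96))
print(notes.shape, fingers.shape)  # (2, 88) (2, 10)
```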
Citations: 10