
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP): Latest Publications

Graph Grouping Loss for Metric Learning of Face Image Representations
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301861
Nakamasa Inoue
This paper proposes Graph Grouping (GG) loss for metric learning and its application to face verification. GG loss predisposes image embeddings of the same identity to be close to each other, and those of different identities to be far from each other by constructing and optimizing graphs representing the relation between images. Further, to reduce the computational cost, we propose an efficient way to compute GG loss for cases where embeddings are L2 normalized. In experiments, we demonstrate the effectiveness of the proposed method for face verification on the VoxCeleb dataset. The results show that the proposed GG loss outperforms conventional losses for metric learning.
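The central mechanism, pulling L2-normalized embeddings of the same identity together while pushing different identities apart, can be illustrated with a small contrastive-style surrogate. The sketch below (function name, margin, and pairwise formulation are assumptions) is a simplified stand-in for the paper's graph-based GG loss, not its actual definition.

```python
import torch
import torch.nn.functional as F

def grouping_style_loss(embeddings, labels, margin=0.5):
    """Pull same-identity embeddings together and push different identities
    apart, using cosine similarity of L2-normalized vectors. Illustrative
    contrastive-style surrogate, not the exact GG loss from the paper."""
    z = F.normalize(embeddings, dim=1)           # L2 normalization
    sim = z @ z.t()                              # pairwise cosine similarities
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    eye = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    pos = same & ~eye                            # same identity, excluding self
    neg = ~same
    # Same identity: similarity should be high (close to 1).
    pos_loss = (1.0 - sim[pos]).mean() if pos.any() else sim.new_zeros(())
    # Different identities: penalize similarity above the margin.
    neg_loss = F.relu(sim[neg] - margin).mean() if neg.any() else sim.new_zeros(())
    return pos_loss + neg_loss

# Example: batch of 8 embeddings with 4 identities.
emb = torch.randn(8, 128, requires_grad=True)
ids = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
loss = grouping_style_loss(emb, ids)
loss.backward()
```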
Citations: 0
Spatiotemporal Guided Self-Supervised Depth Completion from LiDAR and Monocular Camera
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301857
Z. Chen, Hantao Wang, Lijun Wu, Yanlin Zhou, Dapeng Oliver Wu
Depth completion aims to estimate dense depth maps from sparse depth measurements. It has become increasingly important in autonomous driving and has thus drawn wide attention. In this paper, we introduce photometric losses in both the spatial and temporal domains to jointly guide self-supervised depth completion. The method performs accurate end-to-end depth completion using LiDAR and a monocular camera. In particular, we fully utilize the consistent information in temporally adjacent frames and in the stereo views to improve the accuracy of depth completion during model training. We design a self-supervised framework to eliminate the negative effects of moving objects and regions with smooth gradients. Experiments are conducted on KITTI. Results indicate that our self-supervised method attains competitive performance.
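As an illustration of the guiding signal, a photometric loss compares the target frame against a source frame warped into the target view using the predicted dense depth. The sketch below is a minimal L1 version with an assumed interface; the paper's full spatial/temporal loss terms and masking rule are not reproduced.

```python
import torch

def photometric_loss(target, warped, valid_mask=None):
    """L1 photometric error between the target frame and a source frame
    warped into the target view (a temporally adjacent frame, or the other
    stereo view). valid_mask can down-weight moving objects and low-gradient
    regions, in the spirit of the self-supervised framework described above.
    A minimal sketch under assumed interfaces, not the paper's exact loss."""
    err = (target - warped).abs().mean(dim=1, keepdim=True)   # per-pixel error, B x 1 x H x W
    if valid_mask is not None:
        return (err * valid_mask).sum() / valid_mask.sum().clamp(min=1.0)
    return err.mean()

# Example: combine a temporal term and a spatial (stereo) term.
target   = torch.rand(2, 3, 64, 64)
warped_t = torch.rand(2, 3, 64, 64)   # previous frame warped with predicted depth and pose
warped_s = torch.rand(2, 3, 64, 64)   # other stereo view warped with predicted depth
mask     = torch.ones(2, 1, 64, 64)
loss = photometric_loss(target, warped_t, mask) + photometric_loss(target, warped_s, mask)
```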
Citations: 5
Deep Learning-Based Nonlinear Transform for HEVC Intra Coding
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301790
Kun-Min Yang, Dong Liu, Feng Wu
In the hybrid video coding framework, a transform is adopted to exploit the dependency within the input signal. In this paper, we propose a deep learning-based nonlinear transform for intra coding. Specifically, we incorporate the directional information into the residual domain. Then, a convolutional neural network model is designed to achieve better decorrelation and energy compaction than the conventional discrete cosine transform. This work has two main contributions. First, we propose to use the intra prediction signal to reduce the directionality in the residual. Second, we present a novel loss function to characterize the efficiency of the transform during the training. To evaluate the compression performance of the proposed transform, we implement it into the High Efficiency Video Coding reference software. Experimental results demonstrate that the proposed method achieves up to 1.79% BD-rate reduction for natural videos.
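To make the idea of feeding the intra prediction signal alongside the residual concrete, here is a toy convolutional transform; the layer count, channel widths and coefficient layout are assumptions rather than the architecture used in the paper.

```python
import torch
import torch.nn as nn

class ResidualTransformNet(nn.Module):
    """Toy CNN transform for an N x N intra residual block. The intra
    prediction signal is concatenated with the residual so the network can
    account for the directionality introduced by prediction, as the abstract
    suggests. Layer count and channel widths are assumptions, not the
    architecture used in the paper."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),   # "coefficient" map, same size as the block
        )

    def forward(self, residual, intra_pred):
        x = torch.cat([residual, intra_pred], dim=1)  # B x 2 x N x N
        return self.net(x)

# Example forward pass on an 8x8 residual block.
net = ResidualTransformNet()
coeffs = net(torch.randn(1, 1, 8, 8), torch.randn(1, 1, 8, 8))
```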
Citations: 6
ERP-Based CTU Splitting Early Termination for Intra Prediction of 360 videos
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301879
Bernardo Beling, Iago Storch, L. Agostini, B. Zatt, S. Bampi, D. Palomino
This work presents an equirectangular projection (ERP) based Coding Tree Unit (CTU) splitting early termination algorithm for High Efficiency Video Coding (HEVC) intra prediction of 360-degree videos. The proposed algorithm adaptively applies early termination in HEVC CTU splitting based on the distortion properties of the ERP projection, which generates homogeneous regions at the top and bottom portions of a video frame. Experimental results show an average of 24% time saving with a 0.11% coding efficiency loss, significantly reducing the encoding complexity with only minor impact on encoding efficiency. Moreover, compared with all related works, the proposed solution presents the best trade-off between time saving and coding efficiency.
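The decision rule can be pictured as a function of the CTU's vertical position in the ERP frame, since rows near the poles are stretched and homogeneous. The sketch below uses illustrative thresholds (a 20% polar band and a depth limit of 1), which are assumptions rather than the paper's exact criterion.

```python
def skip_further_splitting(ctu_row, num_ctu_rows, depth,
                           polar_fraction=0.2, max_polar_depth=1):
    """Decide whether to terminate CTU splitting early for a 360-degree
    video in ERP format. CTUs near the top/bottom of the frame map to the
    poles, where ERP stretching makes content homogeneous, so deep splitting
    rarely pays off. The thresholds are illustrative assumptions, not the
    paper's exact rule."""
    polar_rows = int(num_ctu_rows * polar_fraction)
    in_polar_band = ctu_row < polar_rows or ctu_row >= num_ctu_rows - polar_rows
    return in_polar_band and depth >= max_polar_depth

# Example: a frame with 17 CTU rows; the top row stops splitting at depth 1.
print(skip_further_splitting(ctu_row=0, num_ctu_rows=17, depth=1))   # True
print(skip_further_splitting(ctu_row=8, num_ctu_rows=17, depth=1))   # False (equator)
```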
Citations: 2
From Low to Super Resolution and Beyond
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301878
C. Kok, Wing-Shan Tam
The tutorial starts with an introduction to digital image interpolation and single image super-resolution. It continues with the definition of various image interpolation performance measurement indices, both objective and subjective. The core of this tutorial is the application of covariance-based interpolation to achieve high-visual-quality image interpolation and single image super-resolution. Layer by layer, it progresses from covariance-based edge-directed image interpolation techniques, which make use of a stochastic image model without an explicit edge map, to iterative covariance-correction-based image interpolation. The edge-based interpolation incorporates the human visual system to achieve visually pleasant high-resolution results. On each layer, the tutorial presents the pros and cons of each image model and interpolation technique, solutions to alleviate the interpolation visual artifacts of each technique, and innovative modifications that overcome the limitations of traditional edge-directed image interpolation, including spatially adaptive pixel intensity estimation, pixel intensity correction, error propagation mitigation, covariance window adaptation, and iterative covariance correction. The tutorial extends from theoretical and analytical discussions to detailed implementation using MATLAB. The audience will be able to take home the implementation details, as well as the performance and complexity of the interpolation algorithms discussed in this tutorial.
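As a taste of the covariance-based (NEDI-style) core covered by the tutorial, the sketch below estimates edge-directed interpolation weights by solving a local least-squares system built from a low-resolution window; the window layout and solver are assumptions, not the tutorial's exact algorithm.

```python
import numpy as np

def covariance_weights(lr_patch):
    """Estimate edge-directed interpolation weights from a local window of
    the low-resolution image: each interior LR pixel is regressed on its
    four diagonal neighbours, and the least-squares solution gives weights
    that follow local edge orientation. A minimal sketch with an assumed
    window layout, not the tutorial's exact covariance-based algorithm."""
    h, w = lr_patch.shape
    neighbours, targets = [], []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            neighbours.append([lr_patch[i-1, j-1], lr_patch[i-1, j+1],
                               lr_patch[i+1, j-1], lr_patch[i+1, j+1]])
            targets.append(lr_patch[i, j])
    C = np.asarray(neighbours, dtype=np.float64)
    y = np.asarray(targets, dtype=np.float64)
    a, *_ = np.linalg.lstsq(C, y, rcond=None)   # local least-squares fit
    return a                                    # weights for the 4 diagonal neighbours

# Example: weights estimated from a 6x6 LR window, used to fill one HR pixel.
patch = np.random.rand(6, 6)
w4 = covariance_weights(patch)
hr_estimate = w4 @ np.array([patch[2, 2], patch[2, 3], patch[3, 2], patch[3, 3]])
```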
Citations: 0
CNN-Based Anomaly Detection For Face Presentation Attack Detection With Multi-Channel Images
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301818
Yuge Zhang, Min Zhao, Longbin Yan, Tiande Gao, Jie Chen
Recently, face recognition systems have received significant attention, and many works have focused on presentation attacks (PAs). However, generalization to unseen PAs remains challenging in real scenarios, as the attack samples in the training database may not cover all possible PAs. In this paper, we propose to perform face presentation attack detection (PAD) on multi-channel images using convolutional neural network based anomaly detection. Multi-channel images provide rich information for distinguishing between different modes of attack, and the anomaly detection based technique ensures generalization performance. We evaluate the performance of our method on the wide multi-channel presentation attack (WMCA) dataset.
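The anomaly detection view can be illustrated with a simple one-class model over CNN embeddings: fit a distribution to bona fide samples only and flag large deviations as attacks. The Gaussian model, Mahalanobis scoring and threshold below are assumptions standing in for the paper's CNN-based detector.

```python
import numpy as np

class GaussianAnomalyScorer:
    """One-class anomaly scoring on CNN embeddings: fit a Gaussian to
    embeddings of bona fide multi-channel face samples only, then flag test
    embeddings with a large Mahalanobis distance as presentation attacks.
    The Gaussian model and threshold are assumptions, not the paper's
    detector."""
    def fit(self, bonafide_embeddings):
        x = np.asarray(bonafide_embeddings, dtype=np.float64)
        self.mean = x.mean(axis=0)
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])  # regularized covariance
        self.prec = np.linalg.inv(cov)
        return self

    def score(self, embeddings):
        d = np.asarray(embeddings, dtype=np.float64) - self.mean
        return np.einsum("ij,jk,ik->i", d, self.prec, d)  # squared Mahalanobis distance

# Example: train on bona fide embeddings, score a test batch against a quantile threshold.
train = np.random.randn(500, 64)
scorer = GaussianAnomalyScorer().fit(train)
threshold = np.quantile(scorer.score(train), 0.99)
is_attack = scorer.score(np.random.randn(10, 64)) > threshold
```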
Citations: 3
Orthogonal Features Fusion Network for Anomaly Detection
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301755
Teli Ma, Yizhi Wang, Jinxin Shao, Baochang Zhang, D. Doermann
Generative models have been successfully used for anomaly detection; however, they require a large number of parameters and considerable computation overhead, especially when spatial and temporal networks are trained in the same framework. In this paper, we introduce a novel network architecture, the Orthogonal Features Fusion Network (OFF-Net), to solve the anomaly detection problem. We show that the convolutional feature maps used for generating future frames are orthogonal to each other, which can improve the representation capacity of generative models and strengthen temporal connections between adjacent images. We introduce a simple but effective module, easily mounted on convolutional neural networks (CNNs) with negligible additional parameters, which can replace the widely used optical flow network and significantly improve the performance of anomaly detection. Extensive experimental results demonstrate the effectiveness of OFF-Net: we outperform the state-of-the-art model by 1.7% in terms of AUC. Compared with prevailing prior arts that use an optical flow network, we save around 85M parameters without compromising performance.
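A generic way to express the orthogonality property on channel-wise feature maps is to penalize off-diagonal entries of their Gram matrix, as sketched below; this is an illustrative regularizer, not the OFF-Net module itself.

```python
import torch

def orthogonality_penalty(feature_maps):
    """Encourage channel-wise convolutional feature maps to be mutually
    orthogonal by penalizing off-diagonal entries of their Gram matrix.
    Illustrates the orthogonality property OFF-Net builds on; a generic
    regularizer, not the paper's module."""
    b, c, h, w = feature_maps.shape
    f = feature_maps.reshape(b, c, h * w)
    f = torch.nn.functional.normalize(f, dim=2)       # unit-norm per channel
    gram = torch.bmm(f, f.transpose(1, 2))            # B x C x C channel similarities
    off_diag = gram - torch.eye(c, device=f.device)   # ideal Gram matrix is the identity
    return (off_diag ** 2).mean()

# Example: penalty on a batch of 16-channel feature maps.
feats = torch.randn(4, 16, 32, 32, requires_grad=True)
loss = orthogonality_penalty(feats)
loss.backward()
```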
Citations: 0
Stereoscopic image reflection removal based on Wasserstein Generative Adversarial Network
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301892
Xiuyuan Wang, Yikun Pan, D. Lun
Reflection removal is a long-standing problem in computer vision. In this paper, we consider the reflection removal problem for stereoscopic images. By exploiting the depth information of stereoscopic images, a new background edge estimation algorithm based on the Wasserstein Generative Adversarial Network (WGAN) is proposed to distinguish the edges of the background image from the reflection. The background edges are then used to reconstruct the background image. We compare the proposed approach with the state-of-the-art reflection removal methods. Results show that the proposed approach can outperform the traditional single-image based methods and is comparable to the multiple-image based approach while having a much simpler imaging hardware requirement.
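For reference, the WGAN formulation the edge estimator builds on amounts to the following critic and generator objectives; the network interfaces, and the omission of any gradient penalty or weight clipping, are assumptions of this sketch.

```python
import torch

def wgan_losses(critic, real_edges, fake_edges):
    """Wasserstein GAN objectives for an edge-estimation generator: the
    critic scores real background edge maps higher than generated ones, and
    the generator is trained to raise the critic score of its output. A
    minimal sketch of the WGAN formulation; gradient penalty or weight
    clipping is omitted."""
    critic_loss = critic(fake_edges.detach()).mean() - critic(real_edges).mean()
    generator_loss = -critic(fake_edges).mean()
    return critic_loss, generator_loss

# Example with a toy critic over 1-channel edge maps.
critic = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 1))
real = torch.rand(2, 1, 64, 64)
fake = torch.rand(2, 1, 64, 64, requires_grad=True)
d_loss, g_loss = wgan_losses(critic, real, fake)
```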
Citations: 0
Learning the Connectivity: Situational Graph Convolution Network for Facial Expression Recognition
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301773
Jinzhao Zhou, Xingming Zhang, Yang Liu
Previous studies recognizing expressions with facial graph topology mostly use a fixed facial graph structure established by the physical dependencies among facial landmarks. However, the static graph structure inherently lacks flexibility in non-standardized scenarios. This paper proposes a dynamic-graph-based method for effective and robust facial expression recognition. To capture action-specific dependencies among facial components, we introduce a link inference structure, called the Situational Link Generation Module (SLGM). We further propose the Situational Graph Convolution Network (SGCN) to automatically detect and recognize facial expression in various conditions. Experimental evaluations on two lab-constrained datasets, CK+ and Oulu, along with an in-the-wild dataset, AFEW, show the superior performance of the proposed method. Additional experiments on occluded facial images further demonstrate the robustness of our strategy.
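The idea of inferring the graph from the input, rather than fixing it from landmark physics, can be sketched as a layer that predicts a soft adjacency from node features and then convolves over it. The dot-product link generation and layer sizes below are assumptions, not the paper's SLGM/SGCN design.

```python
import torch
import torch.nn as nn

class DynamicGraphConv(nn.Module):
    """Infer a soft adjacency matrix from facial-landmark node features and
    apply one graph convolution over it. This mimics a learned,
    situation-dependent graph rather than a fixed facial graph; the
    dot-product link generation and layer sizes are assumptions."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.query = nn.Linear(in_dim, in_dim)
        self.key = nn.Linear(in_dim, in_dim)
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):                       # x: B x N x in_dim landmark features
        logits = self.query(x) @ self.key(x).transpose(1, 2)
        adj = torch.softmax(logits, dim=-1)     # learned, input-dependent adjacency
        return torch.relu(self.proj(adj @ x))   # aggregate neighbours, then transform

# Example: 68 facial landmarks with 32-dim features.
layer = DynamicGraphConv(32, 64)
out = layer(torch.randn(4, 68, 32))            # B x 68 x 64
```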
Citations: 2
Learning Convolution Feature Aggregation via Edge Attention Convolution Network for Person Re-Identification
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301848
Chaoqun Lin, R. Guo, Mingkun Li, Xianbiao Qi, Chun-Guang Li
Person Re-Identification (Re-ID) is a challenging task of matching pedestrian images collected from non-overlapping camera views, owing to large variations from pose changes, occlusions, varying illumination and cluttered backgrounds. Recently, graph convolution networks and graph neural networks have gained increasing research attention in person Re-ID. However, the existing methods have not fully exploited the available features on the graph. In this paper, we propose an efficient and effective end-to-end trainable framework, termed the Edge Attention Convolution Network (EACN), to perform convolution feature learning and attentive feature aggregation for person Re-ID, in which the learned convolution features on each vertex and its edges are attentively aggregated on a dynamic graph. We conduct extensive experiments on two large benchmark datasets, Market-1501 and DukeMTMC. Experimental results validate the efficiency and effectiveness of our proposal.
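The attentive aggregation of edge features can be sketched as a layer that scores each edge (here, the difference between endpoint features) and uses the resulting attention weights to aggregate neighbours; this is a generic edge-attention layer, not the exact EACN block.

```python
import torch
import torch.nn as nn

class EdgeAttentionAggregation(nn.Module):
    """Aggregate convolution features over a graph of pedestrian images by
    attending to edge features (differences between endpoint features). A
    generic edge-attention layer illustrating attentive aggregation, not the
    paper's exact EACN block."""
    def __init__(self, dim):
        super().__init__()
        self.edge_score = nn.Linear(dim, 1)
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, x):                                  # x: B x N x dim node features
        edge = x.unsqueeze(1) - x.unsqueeze(2)             # B x N x N x dim edge features
        attn = torch.softmax(self.edge_score(edge).squeeze(-1), dim=-1)  # B x N x N
        agg = attn @ x                                     # attention-weighted neighbours
        return torch.relu(self.update(torch.cat([x, agg], dim=-1)))

# Example: a dynamic graph over 16 gallery/query features of dimension 256.
layer = EdgeAttentionAggregation(256)
out = layer(torch.randn(2, 16, 256))
```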
Citations: 0