
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP): Latest Papers

Graph Grouping Loss for Metric Learning of Face Image Representations
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301861
Nakamasa Inoue
This paper proposes Graph Grouping (GG) loss for metric learning and its application to face verification. GG loss predisposes image embeddings of the same identity to be close to each other, and those of different identities to be far from each other by constructing and optimizing graphs representing the relation between images. Further, to reduce the computational cost, we propose an efficient way to compute GG loss for cases where embeddings are L2 normalized. In experiments, we demonstrate the effectiveness of the proposed method for face verification on the VoxCeleb dataset. The results show that the proposed GG loss outperforms conventional losses for metric learning.
Citations: 0
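The abstract above mentions an efficient computation for L2-normalized embeddings but does not give the formula. The sketch below is not the GG loss itself; it only illustrates the standard identity that such a shortcut can build on: for unit-length vectors, the squared Euclidean distance equals 2 - 2 times the dot product. The function names are illustrative assumptions.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_distance_from_dot(a, b):
    """Squared Euclidean distance, valid only when a and b have unit length."""
    return 2.0 - 2.0 * sum(x * y for x, y in zip(a, b))

# Two toy embeddings (made-up values), normalized to unit length.
a = l2_normalize([3.0, 4.0])
b = l2_normalize([1.0, 0.0])

# Direct computation and the dot-product shortcut agree.
direct = sum((x - y) ** 2 for x, y in zip(a, b))
shortcut = squared_distance_from_dot(a, b)
```

With this identity, all pairwise distances in a batch can be obtained from a single matrix of dot products, which is one common reason to normalize embeddings in metric learning.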
Deep Near Infrared Colorization with Semantic Segmentation and Transfer Learning
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301788
Fengqiao Wang, Lu Liu, Cheolkon Jung
Although near-infrared (NIR) images contain no color, they have abundant and clear textures. In this paper, we propose deep NIR colorization with semantic segmentation and transfer learning. NIR images capture the invisible spectrum (700-1000 nm), which is quite different from visible-spectrum images. We employ convolutional layers to build a relationship between single NIR images and three-channel color images, instead of mapping to the Lab or YCbCr color space. Moreover, we use semantic segmentation as global prior information to refine the colorization of smooth regions of objects. We use a color divergence loss to further optimize NIR colorization results with good structures and edges. Since the training dataset is not large enough to capture rich color information, we adopt transfer learning to obtain color and semantic information. Experimental results verify that the proposed method produces a natural color image from a single NIR image and outperforms state-of-the-art methods in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).
Citations: 9
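The colorization abstract above reports results in terms of PSNR. As a reminder of what that metric measures, here is a minimal sketch of PSNR for 8-bit samples, using the standard definition 10 * log10(peak^2 / MSE). The flat pixel lists are made-up toy values, not data from the paper.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy 4-pixel example: one pixel is off by 10 gray levels.
reference = [0, 64, 128, 255]
estimate = [0, 64, 128, 245]
score = psnr(reference, estimate)
```

Higher values indicate a reconstruction closer to the reference. SSIM, the other metric named in the abstract, compares local structure rather than raw error and is not covered by this sketch.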
Machine Learning for Photometric Redshift Estimation of Quasars with Different Samples
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301849
Yanxia Zhang, Xin Jin, Jingyi Zhang, Yongheng Zhao
We compare the performance of Support Vector Machine, XGBoost, LightGBM, k-Nearest Neighbors, Random Forests and Extra-Trees on the photometric redshift estimation of quasars based on the SDSS_WISE sample. For this sample, LightGBM is superior in speed, while k-Nearest Neighbors, Random Forests and Extra-Trees show better performance. Then k-Nearest Neighbors, Random Forests and Extra-Trees are applied to the SDSS, SDSS_WISE, SDSS_UKIDSS, WISE_UKIDSS and SDSS_WISE_UKIDSS samples. The results show that the performance of an algorithm depends on the sample selection, sample size, input pattern and information from different bands; for the same sample, more information yields better performance, but different algorithms show different accuracy; no single algorithm is superior on every sample.
Citations: 1
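Of the algorithms compared in the abstract above, k-Nearest Neighbors is the simplest to write down. The sketch below shows k-NN regression in its plainest form: predict the target of a query as the mean target of its k closest training points. The two-dimensional features and redshift values are synthetic placeholders, not measurements from SDSS, WISE or UKIDSS, and the paper's actual input patterns are not reproduced here.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: math.dist(train_x[i], query))
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

# Synthetic stand-ins for photometric features (e.g. colors) and redshifts.
train_x = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0]]
train_y = [0.5, 0.6, 0.7, 2.0, 2.2]

# The query sits next to the first three points, so their mean is returned.
pred = knn_predict(train_x, train_y, [0.05, 0.05], k=3)
```

Adding bands from another survey corresponds to appending more entries to each feature vector, which is how the "more information" comparison in the abstract could be set up in such a sketch.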
ERP-Based CTU Splitting Early Termination for Intra Prediction of 360 videos
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301879
Bernardo Beling, Iago Storch, L. Agostini, B. Zatt, S. Bampi, D. Palomino
This work presents an equirectangular projection (ERP) based Coding Tree Unit (CTU) splitting early termination algorithm for the High Efficiency Video Coding (HEVC) intra prediction of 360-degree videos. The proposed algorithm adaptively employs early termination in the HEVC CTU splitting based on the distortion properties of the ERP, which generates homogeneous regions at the top and bottom portions of a video frame. Experimental results show an average time saving of 24% with a 0.11% coding efficiency loss, significantly reducing the encoding complexity with minor impact on the encoding efficiency. In addition, the proposed solution presents the best results in terms of the relation between time saving and coding efficiency when compared with all related works.
Citations: 2
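The homogeneous polar regions mentioned in the abstract above follow from ERP geometry: a row at latitude phi is stretched horizontally by a factor of 1 / cos(phi). The sketch below computes that factor per pixel row and then applies a purely hypothetical depth cap for CTU rows in the polar bands. The `max_split_depth` rule, its 20% band and its depth values are illustrative assumptions, not the decision rule or thresholds of the paper.

```python
import math

def row_latitude(y, height):
    """Latitude in radians of the center of pixel row y (top row near +pi/2)."""
    return (0.5 - (y + 0.5) / height) * math.pi

def horizontal_stretch(y, height):
    """ERP horizontal oversampling factor of row y relative to the equator."""
    return 1.0 / math.cos(row_latitude(y, height))

def max_split_depth(ctu_row, ctu_rows, polar_fraction=0.2,
                    full_depth=3, polar_depth=1):
    """Hypothetical rule: cap the quad-tree depth for CTU rows near the poles."""
    center = (ctu_row + 0.5) / ctu_rows
    in_polar_band = center < polar_fraction or center > 1.0 - polar_fraction
    return polar_depth if in_polar_band else full_depth

top_stretch = horizontal_stretch(0, 1080)       # very large near the pole
equator_stretch = horizontal_stretch(539, 1080)  # close to 1 at the equator
depths = [max_split_depth(r, 10) for r in range(10)]
```

Capping the depth skips the rate-distortion evaluation of smaller blocks in heavily stretched rows, which is the general idea behind an early termination of CTU splitting.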
Stereoscopic image reflection removal based on Wasserstein Generative Adversarial Network
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301892
Xiuyuan Wang, Yikun Pan, D. Lun
Reflection removal is a long-standing problem in computer vision. In this paper, we consider the reflection removal problem for stereoscopic images. By exploiting the depth information of stereoscopic images, a new background edge estimation algorithm based on the Wasserstein Generative Adversarial Network (WGAN) is proposed to distinguish the edges of the background image from the reflection. The background edges are then used to reconstruct the background image. We compare the proposed approach with state-of-the-art reflection removal methods. Results show that the proposed approach can outperform the traditional single-image-based methods and is comparable to the multiple-image-based approach while having a much simpler imaging hardware requirement.
Citations: 0
Deep Learning-Based Nonlinear Transform for HEVC Intra Coding
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301790
Kun-Min Yang, Dong Liu, Feng Wu
In the hybrid video coding framework, a transform is adopted to exploit the dependency within the input signal. In this paper, we propose a deep learning-based nonlinear transform for intra coding. Specifically, we incorporate the directional information into the residual domain. Then, a convolutional neural network model is designed to achieve better decorrelation and energy compaction than the conventional discrete cosine transform. This work has two main contributions. First, we propose to use the intra prediction signal to reduce the directionality in the residual. Second, we present a novel loss function to characterize the efficiency of the transform during training. To evaluate the compression performance of the proposed transform, we integrate it into the High Efficiency Video Coding reference software. Experimental results demonstrate that the proposed method achieves up to 1.79% BD-rate reduction for natural videos.
Citations: 6
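The abstract above uses the conventional discrete cosine transform as its baseline for decorrelation and energy compaction. The sketch below is not the proposed CNN transform; it only illustrates what energy compaction means for that baseline, using an orthonormal 1-D DCT-II on a smooth toy signal. The signal values are made up.

```python
import math

def dct2(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(x)))
    return out

def energy_fraction(coeffs, keep):
    """Share of total energy held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total

# A smooth ramp: nearly all energy lands in the two lowest frequencies.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
coeffs = dct2(signal)
compaction = energy_fraction(coeffs, 2)
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the input. A transform compacts energy better when fewer coefficients are needed to reach a given fraction, which is the property the abstract claims to improve.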
CNN-Based Anomaly Detection For Face Presentation Attack Detection With Multi-Channel Images
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301818
Yuge Zhang, Min Zhao, Longbin Yan, Tiande Gao, Jie Chen
Recently, face recognition systems have received significant attention, and many works have focused on presentation attacks (PAs). However, the generalization capacity of PA detection is still challenging in real scenarios, as the attack samples in the training database may not cover all possible PAs. In this paper, we propose to perform face presentation attack detection (PAD) with multi-channel images using convolutional neural network based anomaly detection. Multi-channel images provide rich information to distinguish between different modes of attack, and the anomaly detection based technique ensures the generalization performance. We evaluate the performance of our methods using the wide multi-channel presentation attack (WMCA) dataset.
Citations: 3
From Low to Super Resolution and Beyond
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301878
C. Kok, Wing-Shan Tam
The tutorial starts with an introduction to digital image interpolation and single image super-resolution. It continues with the definition of various image interpolation performance measurement indices, including both objective and subjective indices. The core of this tutorial is the application of covariance-based interpolation to achieve high visual quality image interpolation and single image super-resolution results. Layer by layer, it progresses from covariance-based edge-directed image interpolation techniques, which make use of a stochastic image model without an explicit edge map, to iterative covariance-correction-based image interpolation. The edge-based interpolation incorporates the human visual system to achieve visually pleasant high-resolution interpolation results. On each layer, the tutorial presents the pros and cons of each image model and interpolation technique, solutions to alleviate the interpolation visual artifacts of each technique, and innovative modifications to overcome the limitations of traditional edge-directed image interpolation techniques, which include: spatially adaptive pixel intensity estimation, pixel intensity correction, error propagation mitigation, covariance window adaptation, and iterative covariance correction. The tutorial will extend from theoretical and analytical discussions to detailed implementation using MATLAB. The audience should be able to take home the implementation details, as well as an understanding of the performance and complexity of the interpolation algorithms discussed in this tutorial.
Citations: 0
Learning the Connectivity: Situational Graph Convolution Network for Facial Expression Recognition
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301773
Jinzhao Zhou, Xingming Zhang, Yang Liu
Previous studies recognizing expressions with facial graph topology mostly use a fixed facial graph structure established by the physical dependencies among facial landmarks. However, the static graph structure inherently lacks flexibility in non-standardized scenarios. This paper proposes a dynamic-graph-based method for effective and robust facial expression recognition. To capture action-specific dependencies among facial components, we introduce a link inference structure, called the Situational Link Generation Module (SLGM). We further propose the Situational Graph Convolution Network (SGCN) to automatically detect and recognize facial expression in various conditions. Experimental evaluations on two lab-constrained datasets, CK+ and Oulu, along with an in-the-wild dataset, AFEW, show the superior performance of the proposed method. Additional experiments on occluded facial images further demonstrate the robustness of our strategy.
Citations: 2
Fast compressed sensing recovery using generative models and sparse deviations modeling
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301808
Lei Cai, Yuli Fu, Youjun Xiang, Tao Zhu, Xianfeng Li, Huanqiang Zeng
This paper develops an algorithm that effectively exploits the advantages of both sparse vector recovery methods and generative model-based recovery methods for solving the compressed sensing recovery problem. The proposed algorithm mainly consists of two steps. In the first step, a network-based projected gradient descent (NPGD) is introduced to solve a non-convex optimization problem, obtaining a preliminary recovery of the original signal. Then, with the obtained preliminary recovery, an l1-norm regularized optimization problem is solved by optimizing for sparse deviation vectors. Experimental results on two benchmark datasets for image compressed sensing clearly demonstrate that the proposed recovery algorithm achieves high computation speed, while the reconstruction error decreases continuously as the number of measurements increases.
Citations: 0
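The second step in the abstract above solves an l1-norm regularized problem for a sparse deviation vector, but the abstract does not say which solver is used. As one common option, the sketch below applies iterative soft-thresholding (ISTA) to minimize 0.5 * ||A v - r||^2 + lam * ||v||_1. Treating `r` as the measurement residual left after the preliminary recovery is an assumption made here for illustration, and all names and values are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    out = []
    for x in v:
        if x > t:
            out.append(x - t)
        elif x < -t:
            out.append(x + t)
        else:
            out.append(0.0)
    return out

def matvec(a, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def ista(a, r, lam, step, iters=200):
    """Minimize 0.5 * ||a v - r||^2 + lam * ||v||_1 by ISTA.

    `step` should not exceed 1 / (largest eigenvalue of a^T a).
    """
    n = len(a[0])
    v = [0.0] * n
    for _ in range(iters):
        resid = [p - q for p, q in zip(matvec(a, v), r)]
        grad = [sum(a[i][j] * resid[i] for i in range(len(a)))
                for j in range(n)]
        v = soft_threshold([vj - step * gj for vj, gj in zip(v, grad)],
                           step * lam)
    return v

# With an identity matrix the minimizer is the soft-thresholded residual.
identity = [[1.0, 0.0], [0.0, 1.0]]
deviation = ista(identity, [3.0, 0.5], lam=1.0, step=1.0)
```

The small second entry is set exactly to zero, which is how the l1 penalty produces a sparse deviation vector.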