
Latest publications from the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)

A Novel Visual Analysis Oriented Rate Control Scheme for HEVC
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301817
Qi Zhang, Shanshe Wang, Siwei Ma
Recent years have witnessed an explosion of machine visual intelligence. While powerful deep-learning-based models have achieved impressive visual analysis performance, the texture and feature distortion caused by image and video coding is becoming a challenge in practical settings. In this paper, a new rate control scheme is proposed to improve visual analysis performance on coded video frames. First, a new kind of visual analysis distortion is introduced to build a Rate-Joint-Distortion model. Second, the Rate-Joint-Distortion optimization problem is solved with the Lagrange multiplier method, and the relationship between the rate and the Lagrange multiplier λ is described by a hyperbolic model. Third, a logarithmic λ–QP model is established to achieve the minimum Rate-Joint-Distortion cost for a given λ. Experimental results show that the proposed scheme improves visual analysis performance while keeping the number of coding bits stable.
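The scheme builds on the standard HEVC R-λ rate control chain. A minimal sketch of that chain is shown below, using the well-known HM default constants rather than any values from this paper; the paper's contribution is plugging a joint analysis-plus-pixel distortion into this optimization, which the sketch does not model.

```python
import math

# HM-style R-lambda rate control. The constants are the familiar HEVC
# defaults, not values taken from this paper.
ALPHA, BETA = 3.2003, -1.367            # hyperbolic model: lambda = alpha * bpp^beta

def lambda_from_rate(bpp):
    """Hyperbolic rate model: map a target bits-per-pixel to lambda."""
    return ALPHA * (bpp ** BETA)

def qp_from_lambda(lmbda):
    """Logarithmic lambda-QP model: QP = a*ln(lambda) + b (HEVC defaults)."""
    return round(4.2005 * math.log(lmbda) + 13.7122)

lmbda = lambda_from_rate(0.1)           # e.g. a 0.1 bpp target
print(qp_from_lambda(lmbda))            # -> a QP around 32
```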
Citations: 2
Drone-Based Car Counting via Density Map Learning
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301785
Jingxian Huang, Guanchen Ding, Yujia Guo, Daiqin Yang, Sihan Wang, Tao Wang, Yunfei Zhang
Car counting in drone-based images is a challenging task in computer vision. Most advanced counting methods are based on density maps. Usually, density maps are first generated by convolving ground-truth point maps with a Gaussian kernel for later model learning (generation); a counting network then learns to predict density maps from input images (estimation). Most studies focus on the estimation problem while overlooking the generation problem. In this paper, a training framework is proposed that generates density maps by learning, training the generation and estimation subnetworks jointly. Experiments demonstrate that our method outperforms other density-map-based methods and achieves the best performance on drone-based car counting.
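For reference, the conventional fixed-kernel generation step that the paper's learned generator replaces can be sketched as follows; the kernel width σ is an illustrative choice, not a value from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map_from_points(points, shape, sigma=4.0):
    """Fixed-kernel baseline: place a unit impulse at each annotated car
    center, then convolve with a Gaussian so the map integrates to the
    object count."""
    dmap = np.zeros(shape, dtype=np.float32)
    for x, y in points:                        # (x, y) pixel coordinates
        dmap[int(y), int(x)] += 1.0
    return gaussian_filter(dmap, sigma=sigma)

dmap = density_map_from_points([(30, 40), (120, 80)], shape=(240, 320))
print(dmap.sum())                              # ~2.0: the integral recovers the count
```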
Citations: 6
VCIP 2020 Index
Pub Date: 2020-12-01 DOI: 10.1109/vcip49819.2020.9301896
{"title":"VCIP 2020 Index","authors":"","doi":"10.1109/vcip49819.2020.9301896","DOIUrl":"https://doi.org/10.1109/vcip49819.2020.9301896","url":null,"abstract":"","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129770108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
UGNet: Underexposed Images Enhancement Network based on Global Illumination Estimation
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301810
Yuan Fang, Wenzhe Zhu, Qing Zhu
This paper proposes a new neural network for enhancing underexposed images. Instead of a decomposition method based on Retinex theory, we introduce smooth dilated convolution to estimate the global illumination of the input image and implement an end-to-end learning network model. Based on this model, we formulate a multi-term loss function that combines content, color, texture, and smoothness losses. Our extensive experiments demonstrate that this method is superior to other methods for underexposed image enhancement: it recovers more color detail and can be applied robustly to a variety of underexposed images.
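A minimal PyTorch-style sketch of such a multi-term objective follows; the exact form and weighting of each term here are assumptions for illustration, not the paper's published loss.

```python
import torch
import torch.nn.functional as F

def multi_term_loss(pred, target, illum, w=(1.0, 0.5, 0.5, 0.1)):
    """Sketch of a content + color + texture + smoothness objective.
    pred/target: N,3,H,W images; illum: N,1,H,W illumination estimate.
    Weights w and term definitions are illustrative."""
    content = F.l1_loss(pred, target)
    # color: cosine distance between RGB vectors at each pixel
    color = (1 - F.cosine_similarity(pred, target, dim=1)).mean()
    # texture: L1 on horizontal/vertical image gradients
    dx = lambda t: t[..., :, 1:] - t[..., :, :-1]
    dy = lambda t: t[..., 1:, :] - t[..., :-1, :]
    texture = F.l1_loss(dx(pred), dx(target)) + F.l1_loss(dy(pred), dy(target))
    # smoothness: total variation on the estimated illumination map
    smooth = dx(illum).abs().mean() + dy(illum).abs().mean()
    return w[0]*content + w[1]*color + w[2]*texture + w[3]*smooth
```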
Citations: 1
A Dense-Gated U-Net for Brain Lesion Segmentation
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301852
Zhongyi Ji, Xiao Han, Tong Lin, Wenmin Wang
Brain lesion segmentation plays a crucial role in diagnosis and in monitoring disease progression. DenseNets have been widely used for medical image segmentation, but considerable redundancy arises in densely connected feature maps, and training becomes harder. In this paper, we address the brain lesion segmentation task by proposing a Dense-Gated U-Net (DGNet), a hybrid of dense-gated blocks and U-Net. The main contribution lies in the dense-gated blocks, which explicitly model dependencies among concatenated layers and alleviate redundancy. Based on dense-gated blocks, DGNet achieves weighted concatenation and suppresses useless features. Extensive experiments on the MICCAI BraTS 2018 challenge and on our collected intracranial hemorrhage dataset demonstrate that our approach outperforms a powerful backbone model and other state-of-the-art methods.
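A sketch of what gated weighted concatenation can look like; the scalar-gate design and layer sizes are assumptions, not the paper's exact block.

```python
import torch
import torch.nn as nn

class DenseGatedBlock(nn.Module):
    """Each incoming feature map gets a learned gate before concatenation,
    so redundant maps can be suppressed instead of passed along verbatim."""
    def __init__(self, in_channels, growth=32, n_inputs=3):
        super().__init__()
        self.gates = nn.Parameter(torch.zeros(n_inputs))   # one scalar gate per input
        self.conv = nn.Conv2d(n_inputs * in_channels, growth, 3, padding=1)

    def forward(self, feats):                  # feats: list of n_inputs NCHW tensors
        g = torch.sigmoid(self.gates)          # gates in (0, 1)
        gated = [g[i] * f for i, f in enumerate(feats)]
        return self.conv(torch.cat(gated, dim=1))
```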
Citations: 3
Adaptive Resolution Change for Versatile Video Coding
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301762
Tsui-Shan Chang, Yu-Chen Sun, Ling Zhu, J.-G. Lou
This paper presents an adaptive resolution change (ARC) method adopted in versatile video coding (VVC) to adapt video bitstream transmission to dynamic network environments. The approach enables resolution changes at any frame within a video sequence without inserting an instantaneous decoder refresh (IDR) or intra random access picture (IRAP). The underlying techniques include reference picture resampling and handling of interactions between existing coding tools and the change in resolution. Beyond the techniques adopted in VVC, this paper proposes two techniques, for temporal motion vector prediction and the deblocking filter, that further improve both subjective and objective quality. Experimental results show that the combined ARC method avoids the bit-cost burden of inserting an intra frame at resolution changes, while achieving 18%, 21%, and 21% BD-rate reductions for the Y, Cb, and Cr components, respectively.
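The core reference-resampling mechanics, interpolating the reference picture to the current resolution and scaling motion vectors by the resolution ratio, can be sketched as follows; bilinear filtering stands in for VVC's actual interpolation filters.

```python
import torch
import torch.nn.functional as F

def resample_reference(ref, cur_size):
    """Resample a reference frame to the current picture's resolution
    before motion compensation (bilinear as a stand-in filter)."""
    return F.interpolate(ref, size=cur_size, mode='bilinear', align_corners=False)

def scale_mv(mv, ref_size, cur_size):
    """Scale a motion vector (mvx, mvy) by the resolution ratio."""
    sx = cur_size[1] / ref_size[1]
    sy = cur_size[0] / ref_size[0]
    return (mv[0] * sx, mv[1] * sy)

ref = torch.rand(1, 3, 540, 960)                     # lower-resolution reference
print(resample_reference(ref, (1080, 1920)).shape)   # upsampled to current size
print(scale_mv((4.0, -2.0), (540, 960), (1080, 1920)))
```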
Citations: 5
News Image Steganography: A Novel Architecture Facilitates the Fake News Identification
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301846
Jizhe Zhou, Chi-Man Pun, Yu Tong
A large portion of fake news quotes untampered images from other sources with ulterior motives rather than conducting image forgery. Such elaborate engraftment keeps the inconsistency between images and text reports stealthy, passing off the spurious as the genuine. This paper proposes an architecture named News Image Steganography (NIS) to reveal this inconsistency through GAN-based image steganography. An extractive summarization of a news image is generated from its source texts, and a learned steganographic algorithm encodes and decodes the summarization in the image in a manner that approaches perceptual invisibility. Once an encoded image is quoted, its source summarization can be decoded and presented as ground truth to verify the quoting news. The paired encoder and decoder endow images with the capability to carry their own imperceptible summarization. Our NIS reveals the underlying inconsistency and thereby, according to our experiments and investigations, improves the identification accuracy of fake news that engrafts untampered images.
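A minimal sketch of such an encoder/decoder pair; the architecture, bit length, and residual amplitude below are illustrative assumptions, not the NIS networks.

```python
import torch
import torch.nn as nn

class StegoEncoder(nn.Module):
    """Hide an L-bit summary in an image as a low-amplitude residual."""
    def __init__(self, bits=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + bits, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

    def forward(self, img, msg):               # img: N,3,H,W   msg: N,bits in {0,1}
        m = msg[:, :, None, None].expand(-1, -1, *img.shape[2:])
        return img + 0.01 * self.net(torch.cat([img, m], dim=1))  # near-invisible

class StegoDecoder(nn.Module):
    """Recover the bit logits from an encoded image (train with BCE vs. msg)."""
    def __init__(self, bits=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, bits))

    def forward(self, img):
        return self.net(img)
```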
Citations: 1
NIR image colorization with graph-convolutional neural networks
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301839
D. Valsesia, Giulia Fracastoro, E. Magli
Colorization of near-infrared (NIR) images is a challenging problem because material properties differ at infrared wavelengths, reducing the correlation with visible images. In this paper, we study how graph-convolutional neural networks exploit a more powerful inductive bias than standard CNNs, in the form of non-local self-similarity. We evaluate its impact by showing that training with only mean squared error as the loss leads to poor results with a standard CNN, while the graph-convolutional network produces significantly sharper and more realistic colorizations.
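The non-local self-similarity prior amounts to aggregating features over nearest neighbours found anywhere in the image rather than in a fixed local window. A brute-force sketch of that aggregation, as an illustration of the idea rather than the paper's graph-convolution layer:

```python
import torch

def knn_graph_aggregate(feats, k=8):
    """For every pixel feature vector (feats: N x C, flattened pixels),
    find its k nearest neighbours anywhere in the image and average
    them. O(N^2) memory, so this is for illustration only."""
    d = torch.cdist(feats, feats)                       # pairwise distances
    idx = d.topk(k + 1, largest=False).indices[:, 1:]   # drop the self-match
    return feats[idx].mean(dim=1)                       # N x C neighbourhood average
```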
Citations: 5
Low Resolution Facial Manipulation Detection
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301796
Xiao Han, Zhongyi Ji, Wenmin Wang
Detecting manipulated images and videos is an important aspect of digital media forensics. Owing to the severe loss of discriminative information caused by resolution degradation, the performance of most existing methods drops significantly on low-resolution manipulated images. To address this issue, we propose an Artifacts-Focus Super-Resolution (AFSR) module and a Two-stream Feature Extractor (TFE). The AFSR recovers facial cues and manipulation artifact details using an autoencoder trained with an artifacts-focus loss. The TFE adopts a two-stream feature extractor with keypoint-based fusion pooling to learn discriminative facial representations. The two complementary modules are trained jointly to recover and capture distinctive manipulation artifacts in low-resolution images. Extensive experiments on two benchmarks, FaceForensics++ and DeepfakeTIMIT, demonstrate the favorable performance of our method against other state-of-the-art methods.
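One plausible reading of an artifacts-focus loss is a reconstruction objective that upweights regions flagged as manipulation artifacts; the sketch below makes that assumption concrete and is not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def artifacts_focus_loss(sr, hr, artifact_mask, alpha=4.0):
    """Upweight reconstruction error inside artifact regions so the SR
    module preserves forensic cues. artifact_mask: N,1,H,W in [0,1];
    the weighting scheme and alpha are illustrative."""
    per_pixel = F.l1_loss(sr, hr, reduction='none')
    weight = 1.0 + alpha * artifact_mask
    return (weight * per_pixel).mean()
```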
Citations: 2
Orthogonal Coded Multi-view Structured Light for Inter-view Interference Elimination
Pub Date: 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301891
Zaichao Sun, G. Qian, Zhaoyu Peng, Weiju Dai, Dongjun Sun, Gongyuan Zhang, Nongtao Zhang, Jun Xu, Ren Wang, Chunlin Li
Rapid 3D reconstruction of dynamic scenes is very useful for 3D object structure analysis, accident avoidance for UAVs, and other visual applications. For dynamic scenes, coded structured light methods have been proposed to obtain the depth information of an object in the 3D world, and most of them are based on spatial codification. In practice, two or more cameras and projectors at different viewpoints must measure the dynamic scene simultaneously for rapid 3D reconstruction. However, when two traditional patterns, especially binary ones, overlap, the interference between them poses a new challenge for 3D reconstruction: traditional patterns can hardly be separated from each other, which degrades the quality of the reconstruction. To eliminate this interference, we propose an orthogonal coded multi-view structured light scheme that obtains accurate depth maps of a scene. We also test the stability of the orthogonal patterns on three different scenes, comparing them against traditional patterns. Our scheme achieves new state-of-the-art results in the experiments.
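Orthogonality is what makes the overlapped views separable. A toy demonstration with Hadamard codes follows; the code family and code length are illustrative, and the paper's actual pattern design may differ.

```python
import numpy as np
from scipy.linalg import hadamard

# Two projectors emit temporally orthogonal code sequences. Because the
# rows of a Hadamard matrix are mutually orthogonal, a camera pixel that
# sees both patterns superimposed can recover each projector's
# contribution by correlation.
H = hadamard(8).astype(float)
code_a, code_b = H[1], H[2]                    # orthogonal rows: dot product == 0

a, b = 0.7, 0.3                                # per-projector reflectance at a pixel
observed = a * code_a + b * code_b             # overlapped signal over 8 frames

print(observed @ code_a / len(code_a))         # 0.7 -> projector A recovered
print(observed @ code_b / len(code_b))         # 0.3 -> projector B recovered
```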
Citations: 1