
2021 IEEE International Conference on Image Processing (ICIP): Latest Publications

A Hyperspectral Approach For Unsupervised Spoof Detection With Intra-Sample Distribution
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506625
Tomoya Kaichi, Yuko Ozasa
Despite the high recognition accuracy of recent deep neural networks, they can be easily deceived by spoofing. Spoofs (e.g., a printed photograph) visually resemble the actual objects quite closely. Thus, we propose a method for spoof detection with a hyperspectral image (HSI) that can effectively detect differences in surface materials. In contrast to existing anti-spoofing approaches, the proposed method learns the feature representation for spoof detection without spoof supervision. The informative pixels of an HSI are embedded into the feature space, and the spoof is identified from their distribution. As this is the first attempt at unsupervised spoof detection with an HSI, a new dataset that includes spoofs, named the Hyperspectral Spoof Dataset (HSSD), has been developed. The experimental results indicate that the proposed method performs significantly better than the baselines. The source code and the dataset are available on GitHub.
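The paper's embedding is learned without spoof labels; as a loose illustration of the intra-sample-distribution idea only (not the authors' method), the sketch below scores a hyperspectral cube by the spread of its normalized per-pixel spectra, on the assumption that a print on a single material yields a more compact spectral distribution than a multi-material genuine object:

```python
import numpy as np

def spectral_spread_score(hsi, mask=None):
    """Score one hyperspectral cube (H, W, B) by the spread of its
    per-pixel spectra: a spoof printed on a single material tends to
    produce a more compact intra-sample spectral distribution than a
    genuine object composed of several materials."""
    pixels = hsi.reshape(-1, hsi.shape[-1]).astype(np.float64)
    if mask is not None:                       # keep informative pixels only
        pixels = pixels[mask.reshape(-1)]
    pixels /= np.linalg.norm(pixels, axis=1, keepdims=True) + 1e-8
    centered = pixels - pixels.mean(axis=0)
    return float((centered ** 2).sum(axis=1).mean())  # total variance

# usage: flag a sample as a spoof when its score falls below a threshold
# tuned on bona fide samples only (no spoof supervision)
hsi = np.random.rand(64, 64, 31)               # synthetic stand-in cube
print(spectral_spread_score(hsi))
```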
Citations: 3
Image-Level Iris Morph Attack
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506802
Renu Sharma, A. Ross
We investigate the problem of morph attacks in the context of iris biometrics. A morph attack entails the generation of an image that embodies two different identities, accomplished by combining, i.e., morphing, two biometric samples pertaining to two different identities. While such attacks are increasingly studied in the context of face recognition, they have not been widely analyzed in iris recognition. In this work, we perform iris morphing at the image level and generate morphed iris images using two available datasets (IITD and WVU multi-modal). We demonstrate the vulnerability of three different iris recognition methods to morph attacks, with a success rate of over 90% at a false match rate of 0.01%. We also analyze the textural similarity required between the component images to create a successful morphed image. Finally, we provide preliminary results on the detection of morphed iris images.
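At its core, image-level morphing is a pixel-wise blend of two aligned samples. A minimal sketch, with geometric alignment omitted and the blend weight `alpha` chosen purely for illustration:

```python
import numpy as np

def morph_iris(img_a, img_b, alpha=0.5):
    """Blend two geometrically aligned iris images pixel-wise so the
    result embodies both identities; alpha sets their relative weight."""
    blend = (alpha * img_a.astype(np.float32)
             + (1.0 - alpha) * img_b.astype(np.float32))
    return np.clip(blend, 0, 255).astype(np.uint8)

# usage with two aligned grayscale iris crops of equal size
img_a = np.random.randint(0, 256, (240, 320), dtype=np.uint8)
img_b = np.random.randint(0, 256, (240, 320), dtype=np.uint8)
morphed = morph_iris(img_a, img_b)
```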
Citations: 5
Exploiting Facial Symmetry to Expose Deepfakes
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506272
Gen Li, Yun Cao, Xianfeng Zhao
In this paper, we introduce a new approach to detecting synthetic portrait images and videos. Motivated by the observation that the symmetry of a synthesized facial area is easily broken, the approach aims to reveal tampering traces through features learned from symmetrical facial regions. To do so, a two-stream learning framework is designed that uses a hard-shared deep residual network as the backbone. The feature extractor maps each pair of symmetrical face patches to an angular distance indicating the difference between their symmetry features. Extensive experiments test the effectiveness of the approach in detecting synthetic portrait images and videos, and the results show that it remains effective even on heterogeneous and re-compressed data that were not used to train the detection model.
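A minimal sketch of the two-stream idea with a hard-shared backbone; the untrained resnet18, the 128-dimensional embedding, and the patch size are assumptions rather than the paper's configuration:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

backbone = resnet18(num_classes=128)   # one network, hard-shared by both streams

def symmetry_distance(left_patch, right_patch):
    """Map a pair of symmetric face patches to an angular distance;
    a large angle suggests broken symmetry, i.e. a possible deepfake."""
    mirrored = torch.flip(right_patch, dims=[-1])    # mirror right to match left
    f_l = F.normalize(backbone(left_patch), dim=1)
    f_r = F.normalize(backbone(mirrored), dim=1)
    cos = (f_l * f_r).sum(dim=1).clamp(-1 + 1e-7, 1 - 1e-7)
    return torch.acos(cos)                           # one angle per pair

left = torch.randn(4, 3, 112, 112)
right = torch.randn(4, 3, 112, 112)
print(symmetry_distance(left, right).shape)          # torch.Size([4])
```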
Citations: 6
Zoomable Intra Prediction for Multi-Focus Plenoptic 2.0 Video Coding
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506363
Fan Jiang, Xin Jin, Tingting Zhong
Plenoptic 2.0 videos, which record time-varying light fields with focused plenoptic cameras, are promising for immersive visual applications because they capture densely sampled light fields with high spatial resolution in the rendered sub-apertures. In this paper, an intra prediction method is proposed for compressing multi-focus plenoptic 2.0 videos efficiently. Based on an analysis of the imaging principle of multi-focus plenoptic cameras, zooming relationships among the microimages are discovered and exploited by the proposed method. Positions of the prediction candidates and the zooming factors are derived, after which block zooming and tailoring are proposed to generate novel prediction candidates for weighted prediction. Experimental results demonstrate the superior performance of the proposed method relative to HEVC and state-of-the-art methods.
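A hedged sketch of the block zooming and tailoring step; in the paper the zoom factors follow from the camera's imaging geometry, whereas here they are free parameters and OpenCV's resize stands in for the actual interpolation:

```python
import numpy as np
import cv2

def zoomed_candidate(ref_block, zoom):
    """Zoom a reference microimage block and tailor it back to the
    original block size, producing one intra-prediction candidate."""
    h, w = ref_block.shape[:2]
    scaled = cv2.resize(ref_block, None, fx=zoom, fy=zoom,
                        interpolation=cv2.INTER_LINEAR)
    sh, sw = scaled.shape[:2]
    if zoom >= 1.0:                               # crop the centre when zoomed in
        top, left = (sh - h) // 2, (sw - w) // 2
        return scaled[top:top + h, left:left + w]
    out = np.zeros_like(ref_block)                # pad when zoomed out
    pad_t, pad_l = (h - sh) // 2, (w - sw) // 2
    out[pad_t:pad_t + sh, pad_l:pad_l + sw] = scaled
    return out

def weighted_prediction(cands, weights):
    """Fuse several zoomed candidates into one weighted prediction."""
    w = np.asarray(weights, dtype=np.float64)
    return np.tensordot(w / w.sum(), np.stack(cands), axes=1)

block = np.random.rand(16, 16).astype(np.float32)
cands = [zoomed_candidate(block, z) for z in (0.9, 1.0, 1.1)]
pred = weighted_prediction(cands, [0.25, 0.5, 0.25])
```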
Citations: 4
Cryo-Electron Microscopy Image Denoising Using Multi-Frequency Vector Diffusion Maps
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506435
Yifeng Fan, Zhizhen Zhao
Cryo-electron microscopy (cryo-EM) single particle reconstruction is a general technique for 3D structure determination of macromolecules. However, because the images are acquired at low electron dose, individual particles are extremely hard to visualize, owing to low contrast and a high noise level. In this paper, we propose a novel framework for cryo-EM single particle image denoising that incorporates the recently developed multi-frequency vector diffusion maps [1] to improve the identification and alignment of images with similar viewing directions. In addition, we propose a novel filtering scheme combining graph signal processing with a truncated Fourier-Bessel expansion of the projection images. On both simulated and publicly available real data, we demonstrate that our proposed method is efficient and robust to noise compared with state-of-the-art cryo-EM 2D class averaging algorithms.
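The actual pipeline builds on multi-frequency vector diffusion maps and a Fourier-Bessel expansion; the toy sketch below keeps only the underlying intuition of averaging each projection with its nearest neighbors (images with similar viewing directions) on a similarity graph:

```python
import numpy as np

def knn_graph_denoise(images, k=10):
    """Denoise projection images by averaging each image with its
    k nearest neighbours in pixel space, a crude stand-in for
    neighbours found via diffusion-map embeddings."""
    n = images.shape[0]
    flat = images.reshape(n, -1).astype(np.float64)
    g = flat @ flat.T                              # Gram matrix
    sq = np.diag(g)
    d2 = sq[:, None] + sq[None, :] - 2 * g         # pairwise squared distances
    idx = np.argsort(d2, axis=1)[:, :k + 1]        # self plus k neighbours
    denoised = flat[idx].mean(axis=1)              # average over the patch set
    return denoised.reshape(images.shape)

# usage: 100 noisy 64x64 projections
imgs = np.random.randn(100, 64, 64)
clean = knn_graph_denoise(imgs, k=10)
```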
Citations: 4
Enhanced Back Projection Network Based Stereo Image Super-Resolution Considering Parallax Attention
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506412
Li Ma, Sumei Li
Recent years have witnessed great advances in stereo image super-resolution (SR). However, existing methods consider only horizontal parallax when capturing the stereo correspondence, which is insufficient because vertical parallax inevitably exists in stereo image pairs. To address this problem, we propose an enhanced back projection stereo SR network (EBPSSRnet) that makes full use of the complementary information in stereo images for more accurate SR results. Specifically, we propose a relaxed parallax attention module (rePAM) to handle stereo images with both vertical and horizontal parallax. Then, an enhanced back projection block (EBPB) is developed to extract discriminative features for capturing the stereo correspondence and to consolidate the best representation for reconstruction. Extensive experiments show that the proposed method achieves state-of-the-art performance on the Flickr1024, Middlebury, KITTI2012 and KITTI2015 datasets.
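A rough sketch of what relaxed parallax attention could look like: attention is computed along the epipolar row plus a small vertical window so that modest vertical parallax can be matched. The tensor shapes, the unscaled dot-product scores, and the window size are assumptions, not the rePAM design:

```python
import torch
import torch.nn.functional as F

def relaxed_parallax_attention(feat_l, feat_r, v_window=1):
    """Attend from left-view features to right-view features along the
    same row plus +/- v_window neighbouring rows. feat_*: (B, C, H, W)."""
    q = feat_l.permute(0, 2, 3, 1)                      # (B, H, W, C)
    scores = []
    for dv in range(-v_window, v_window + 1):
        k = torch.roll(feat_r, shifts=dv, dims=2)       # vertical row shift
        k = k.permute(0, 2, 1, 3)                       # (B, H, C, W)
        scores.append(torch.matmul(q, k).unsqueeze(-1)) # (B, H, W, W, 1)
    score = torch.cat(scores, dim=-1)                   # (B, H, W, W, V)
    attn = F.softmax(score.flatten(3), dim=-1).view_as(score)
    out = 0
    for i, dv in enumerate(range(-v_window, v_window + 1)):
        v = torch.roll(feat_r, shifts=dv, dims=2).permute(0, 2, 3, 1)
        out = out + torch.matmul(attn[..., i], v)       # (B, H, W, C)
    return out.permute(0, 3, 1, 2)                      # back to (B, C, H, W)

out = relaxed_parallax_attention(torch.randn(1, 16, 32, 32),
                                 torch.randn(1, 16, 32, 32))
```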
Citations: 1
Action Quality Assessment With Ignoring Scene Context
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506257
Takasuke Nagai, Shoichiro Takeda, Masaaki Matsumura, S. Shimizu, Susumu Yamamoto
We propose an action quality assessment (AQA) method that assesses target action quality while ignoring scene context, a feature unrelated to the target action. Existing AQA methods extract spatiotemporal features related to the target action by applying 3D convolution to the video. However, since their models are not explicitly designed to extract features of the target action, they mis-extract scene context and thus cannot assess the target action quality correctly. To overcome this problem, we impose two losses on an existing AQA model: a scene adversarial loss and our newly proposed human-masked regression loss. The scene adversarial loss encourages the model to ignore scene context through adversarial training. The human-masked regression loss does so by making the correlation between the scores output by the AQA model and those of human referees undefinable when the target action is not visible. Together, these losses lead the model to assess the target action quality while ignoring scene context. We evaluated our method on a diving dataset commonly used for AQA and found that it outperforms current state-of-the-art methods, showing that it effectively ignores scene context while assessing target action quality.
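One plausible way to instantiate the two extra losses; the entropy and variance forms below, and the weights lam_adv and lam_mask, are assumptions rather than the paper's exact formulations:

```python
import torch
import torch.nn.functional as F

def scene_adversarial_loss(scene_logits):
    """Train the feature extractor so an auxiliary scene classifier is
    maximally confused, i.e. maximise the entropy of its predictions
    (minimising the negative entropy below)."""
    p = F.softmax(scene_logits, dim=1)
    return (p * p.clamp_min(1e-8).log()).sum(dim=1).mean()

def human_masked_regression_loss(scores_masked):
    """Push scores predicted on human-masked clips towards a constant,
    so their correlation with referee scores becomes undefinable."""
    return scores_masked.var()

def total_loss(pred, target, scene_logits, scores_masked,
               lam_adv=0.1, lam_mask=0.1):
    reg = F.mse_loss(pred, target)               # ordinary quality regression
    return (reg + lam_adv * scene_adversarial_loss(scene_logits)
                + lam_mask * human_masked_regression_loss(scores_masked))

loss = total_loss(torch.randn(8), torch.randn(8),
                  torch.randn(8, 5), torch.randn(8))
```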
Citations: 2
Quality Assessment of Screen Content Images Based on Convolutional Neural Network with Dual Pathways
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506707
Yongli Chang, Sumei Li, Anqi Liu
To simulate how binocular vision perceives a scene, a dual-pathway convolutional neural network (CNN) for quality assessment of screen content images (SCIs) is proposed. Considering the differing sensitivity of retinal photoreceptor cells to RGB colors and the human visual attention mechanism, we employ a convolutional block attention module (CBAM) to weight the RGB channels and the spatial positions within each channel. In addition, 3D convolution that considers inter-frame information is used to extract correlation features between the RGB channels. Moreover, given the important role of the optic chiasm in binocular vision, we design a strategy to simulate it within the proposed network. Furthermore, since multi-scale and multi-level characteristics are indispensable to the perception of objects in the human visual system (HVS), a new multi-scale and multi-level feature fusion (MSMLFF) module is built to obtain perceptual features at different scales and levels. Experimental results show that the proposed method is superior to several mainstream SCI metrics on publicly accessible databases.
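CBAM is a published module; the compact sketch below shows it weighting the three RGB channels and their spatial positions, with the reduction ratio set to 1 here only to suit the 3-channel case:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed
    by spatial attention, used here to weight the RGB channels and the
    positions within each channel."""
    def __init__(self, channels, reduction=1, kernel_size=7):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.mlp = nn.Sequential(nn.Linear(channels, hidden), nn.ReLU(),
                                 nn.Linear(hidden, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        gate = torch.sigmoid(self.mlp(x.mean(dim=(2, 3)))
                             + self.mlp(x.amax(dim=(2, 3))))
        x = x * gate.view(b, c, 1, 1)                      # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))          # spatial attention

weighted_rgb = CBAM(channels=3)(torch.randn(2, 3, 64, 64))
```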
Citations: 0
Robustness of Time-Resolved Measurement to Unknown and Variable Beam Current in Particle Beam Microscopy
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506340
Luisa Watkins, Sheila W. Seidel, Minxu Peng, Akshay Agarwal, Christopher C. Yu, V. Goyal
Variations in the intensity of the incident beam can cause significant inaccuracies in microscopes that use focused beams of electrons or ions. Existing mitigation methods depend on the artifacts having characteristic spatial structures explained by the raster scan pattern and the temporal correlation of the beam current variations. We show that recently introduced time-resolved measurement methods provide robustness to beam current variations that improves significantly upon existing methods while not depending on the separability of artifact structure from the underlying image content. These advantages are illustrated through Monte Carlo simulations representative of both helium ion microscopy (higher secondary electron yield) and scanning electron microscopy (lower secondary electron yield). Notably, this demonstrates that when the beam current variation is appreciable, time-resolved measurement provides a novel benefit in particle beam microscopy that extends to low secondary electron yields.
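A toy Monte Carlo in the spirit of those simulations; the paper's time-resolved estimators are more principled, and the occupied-slot proxy for ion arrivals below is a simplification chosen only to make the contrast visible:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_pixel(dose=20, eta=3.0, current_scale=1.5, n_sub=256):
    """Monte Carlo for one pixel: incident ions are Poisson, each ion
    yields a Poisson number of secondary electrons (SEs), and the dwell
    time is split into n_sub time-resolved sub-measurements.
    current_scale is the unknown beam-current drift factor."""
    ions = rng.poisson(dose * current_scale / n_sub, size=n_sub)
    se = rng.poisson(eta * ions)              # SEs detected in each slot
    conventional = se.sum() / dose            # scales with the unknown drift
    arrivals = max(int((se > 0).sum()), 1)    # occupied slots ~ ion arrivals
    time_resolved = se.sum() / arrivals       # SEs per detected arrival
    return conventional, time_resolved

# with a 50% current drift the conventional dose-normalized estimate of
# the yield shifts by roughly 50%, while the time-resolved ratio stays
# much closer to the true eta = 3.0
print(simulate_pixel(current_scale=1.0))
print(simulate_pixel(current_scale=1.5))
```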
Citations: 4
Inter-Modality Fusion Based Attention for Zero-Shot Cross-Modal Retrieval
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506182
Bela Chakraborty, Peng Wang, Lei Wang
Zero-shot cross-modal retrieval (ZS-CMR) performs cross-modal retrieval where the test categories lie outside the scope of the training categories. It borrows the intuition of zero-shot learning, which aims to transfer knowledge inferred during the training phase on seen classes to the testing phase on unseen classes, mimicking the real-world scenario in which new object categories continuously populate the multimedia data corpus. Unlike existing ZS-CMR approaches that use generative adversarial networks (GANs) to generate more data, we propose Inter-Modality Fusion based Attention (IMFA) and the framework ZS_INN_FUSE (Zero-Shot cross-modal retrieval using an INNer product with image-text FUSEd features). It exploits the rich semantics of textual data as guidance to infer additional knowledge during the training phase. This is achieved by generating attention weights through the fusion of the image and text modalities, so as to focus on the important regions of an image. We carefully create a zero-shot split based on the large-scale MS-COCO and Flickr30k datasets to perform experiments. The results show that our method improves over the ZS-CMR baseline and the self-attention mechanism, demonstrating the effectiveness of inter-modality fusion in the zero-shot scenario.
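A minimal sketch of fusion-based attention followed by inner-product scoring; the embedding dimension, the single-layer fusion, and the class name are illustrative, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IMFARetrieval(nn.Module):
    """Fuse the text embedding with image region features to weight the
    important regions, then score image-text pairs with an inner product."""
    def __init__(self, dim=512):
        super().__init__()
        self.fuse = nn.Linear(2 * dim, 1)    # fused features -> region weight

    def forward(self, regions, text):
        # regions: (B, R, D) image region features; text: (B, D)
        t = text.unsqueeze(1).expand(-1, regions.size(1), -1)
        attn = torch.softmax(self.fuse(torch.cat([regions, t], dim=-1)), dim=1)
        img = (attn * regions).sum(dim=1)            # attended image embedding
        img = F.normalize(img, dim=-1)
        txt = F.normalize(text, dim=-1)
        return img @ txt.t()                         # (B, B) similarity matrix

model = IMFARetrieval(dim=512)
sims = model(torch.randn(8, 36, 512), torch.randn(8, 512))
```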
Citations: 1