
Latest publications from the 2021 IEEE International Conference on Image Processing (ICIP)

UTR: Unsupervised Learning of Thickness-Insensitive Representations for Electron Microscope Image
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506597
Tong Xin, Bohao Chen, Xi Chen, Hua Han
Registration of serial section electron microscopy (ssEM) images is essential for neural circuit reconstruction. The morphology of neurite structures differs between adjacent sections, which makes it challenging to extract valid features for ssEM image registration. Convolutional neural networks (CNNs) have made unprecedented progress in feature extraction for natural images. However, natural-image registration does not need to account for such morphological differences, so directly applying these methods results in matching failures or over-registration. This paper proposes an unsupervised learning-based representation that takes the morphological differences of ssEM images into account. A CNN is used to extract the features, and the network is trained on focused ion beam scanning electron microscope (FIB-SEM) images. Because FIB-SEM images are acquired in situ, they are naturally registered; sampling them at a given thickness teaches the CNN the changes in neurite structure that sectioning introduces. The learned feature can be plugged directly into existing ssEM registration methods and reduces the negative effect of section thickness on registration accuracy. Experimental results show that the proposed feature outperforms the state-of-the-art method in matching accuracy and significantly improves registration outcomes on ssEM images.
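Below is a minimal sketch of the thickness-aware sampling idea: because FIB-SEM volumes are naturally registered, pairs of slices separated by a chosen thickness serve as free training pairs. The function name, patch size, and pairing strategy are illustrative assumptions, not the authors' exact protocol.

```python
# Hypothetical sketch of thickness-aware pair sampling from a FIB-SEM volume,
# assuming the volume is a (Z, H, W) numpy array. Not the paper's exact recipe.
import numpy as np

def sample_training_pairs(volume, section_thickness, n_pairs, patch=256, rng=None):
    """Sample patch pairs separated by `section_thickness` slices.

    FIB-SEM slices are naturally registered, so both patches in a pair show
    the same location, differing only by the neurite-morphology change that a
    physical section of this thickness would introduce.
    """
    rng = rng or np.random.default_rng()
    z_max, h, w = volume.shape
    pairs = []
    for _ in range(n_pairs):
        z = rng.integers(0, z_max - section_thickness)
        y = rng.integers(0, h - patch)
        x = rng.integers(0, w - patch)
        a = volume[z, y:y + patch, x:x + patch]
        b = volume[z + section_thickness, y:y + patch, x:x + patch]
        pairs.append((a, b))  # positive pair for unsupervised feature learning
    return pairs
```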
Citations: 2
Multi-Scale Background Suppression Anomaly Detection In Surveillance Videos
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506580
Yang Zhen, Yuanfan Guo, Jinjie Wei, Xiuguo Bao, Di Huang
Video anomaly detection has been widely applied in surveillance systems for public security. However, existing weakly supervised video anomaly detection methods tend to ignore the interference of background frames and have limited ability to extract effective temporal information across video snippets. In this paper, a multi-scale background suppression based anomaly detection (MS-BSAD) method is proposed to suppress the interference of background frames. We propose a multi-scale temporal convolution module to extract more temporal information across video snippets for anomalous events of different durations. A modified hinge loss is constructed in the suppression branch to help the model better differentiate abnormal samples from confusing ones. Experiments on the UCF-Crime dataset demonstrate the superiority of our MS-BSAD method on the video anomaly detection task.
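A minimal sketch of a hinge-style ranking loss for the suppression branch is shown below, assuming a MIL setup with per-snippet anomaly scores. The margin value, top-k pooling, and function name are assumptions; the paper's exact modified hinge loss may differ.

```python
# Hypothetical hinge-style ranking loss: top abnormal-bag snippet scores
# should exceed top normal-bag (confusing background) scores by a margin.
import torch

def hinge_suppression_loss(abnormal_scores, normal_scores, margin=1.0, k=3):
    """abnormal_scores, normal_scores: (n_snippets,) scores for one bag each."""
    top_abnormal = abnormal_scores.topk(k).values.mean()
    top_normal = normal_scores.topk(k).values.mean()
    # Zero loss once the margin is satisfied, hinge penalty otherwise.
    return torch.clamp(margin - top_abnormal + top_normal, min=0.0)

# Example: 32 snippets per video bag
loss = hinge_suppression_loss(torch.rand(32), torch.rand(32))
```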
Citations: 2
Federated Trace: A Node Selection Method for More Efficient Federated Learning
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506725
Zirui Zhu, Lifeng Sun
Federated Learning (FL) is a learning paradigm that allows a model to be trained directly on the large amounts of data held by edge devices, without heavy communication costs or privacy leakage. An important problem FL faces is the heterogeneity of data across edge nodes, which hurts convergence efficiency. In this paper, we propose Federated Trace (FedTrace) to address this problem. In FedTrace, we define the time series of performance metrics of the global model on an edge node as that node's training trace, which reflects the node's data distribution. By clustering the training traces, we can identify which nodes have similar data distributions, and this guides node selection in each round of training. Here, we use a simple but effective method: randomly selecting nodes evenly from each cluster. Experiments on various settings demonstrate that our method significantly reduces the number of communication rounds required in FL.
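The node-selection step might look like the following sketch: cluster the per-node training traces and sample clients evenly across clusters. KMeans, the choice of trace metric, and the function names are assumptions introduced here for illustration.

```python
# Hypothetical sketch of trace-guided node selection, assuming each node's
# training trace is a time series of a metric such as its local loss.
import numpy as np
from sklearn.cluster import KMeans

def select_nodes(traces, n_clusters, per_cluster, rng=None):
    """traces: (n_nodes, n_rounds) array of a performance metric over time."""
    rng = rng or np.random.default_rng()
    # Nodes with similar traces are assumed to have similar data distributions.
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(traces)
    selected = []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        take = min(per_cluster, len(members))
        selected.extend(rng.choice(members, size=take, replace=False))
    return selected

# Example: 100 clients, 20 observed rounds, 2 clients from each of 5 clusters
chosen = select_nodes(np.random.rand(100, 20), n_clusters=5, per_cluster=2)
```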
Citations: 3
2DTPCA: A New Framework for Multilinear Principal Component Analysis
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506729
Cagri Ozdemir, R. Hoover, Kyle A. Caudle
Two-directional two-dimensional principal component analysis ((2D)$^{2}$PCA) has shown promising results for its ability to both represent and recognize facial images. This paper extends these results into a multilinear framework (referred to as two-directional tensor PCA, or 2DTPCA for short) using a recently defined tensor operator for third-order tensors. The approach first computes a low-dimensional projection tensor for the row space of the image data (generally referred to as mode-1) and then computes a low-dimensional projection tensor for the column space of the image data (generally referred to as mode-3). Experimental results on the ORL, extended Yale-B, COIL100, and MNIST datasets show that the proposed approach outperforms traditional "tensor-based" PCA approaches in recognition rate while using a much smaller subspace dimension.
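For reference, a minimal sketch of the (2D)$^{2}$PCA baseline that 2DTPCA generalizes is given below: a row-space projection and a column-space projection learned from image scatter matrices. This is the standard matrix formulation, not the paper's tensor-operator construction.

```python
# Sketch of (2D)^2 PCA: learn projections for both image directions from
# the row- and column-direction scatter matrices of the training images.
import numpy as np

def two_directional_pca(images, k_rows, k_cols):
    """images: (n, h, w) array. Returns U: (h, k_rows), V: (w, k_cols)."""
    centered = images - images.mean(axis=0)
    # Row-direction scatter sum_i A_i A_i^T and column-direction sum_i A_i^T A_i
    s_row = np.einsum('nhw,nkw->hk', centered, centered)
    s_col = np.einsum('nhw,nhk->wk', centered, centered)
    # eigh returns ascending eigenvalues; reverse to take leading eigenvectors.
    U = np.linalg.eigh(s_row)[1][:, ::-1][:, :k_rows]
    V = np.linalg.eigh(s_col)[1][:, ::-1][:, :k_cols]
    return U, V

# Per-image feature: U.T @ image @ V, a (k_rows, k_cols) matrix
U, V = two_directional_pca(np.random.rand(50, 32, 32), k_rows=8, k_cols=8)
```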
Citations: 4
Semantic Role Aware Correlation Transformer For Text To Video Retrieval
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506267
Burak Satar, Hongyuan Zhu, X. Bresson, J. Lim
With the emergence of social media, voluminous video clips are uploaded every day, and retrieving the most relevant visual content for a language query becomes critical. Most approaches aim to learn a joint embedding space for plain textual and visual content without adequately exploiting intra-modality structures and inter-modality correlations. This paper proposes a novel transformer that explicitly disentangles text and video into the semantic roles of objects, spatial contexts, and temporal contexts, with an attention scheme that learns the intra- and inter-role correlations among the three roles to discover discriminative features for matching at different levels. Preliminary results on the popular YouCook2 benchmark indicate that our approach surpasses a current state-of-the-art method by a large margin on all metrics, and it also surpasses two SOTA methods on two metrics.
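As a rough illustration of role-wise matching, the sketch below scores a text query against videos by summing per-role cosine similarities over the three roles named in the abstract. The encoders are omitted, and the equal weighting of roles is an assumption.

```python
# Hypothetical role-wise retrieval scoring: one embedding per semantic role
# (objects, spatial context, temporal context) for both query and video.
import torch
import torch.nn.functional as F

def role_similarity(text_roles, video_roles):
    """text_roles, video_roles: (3, d) tensors, one embedding per role."""
    return sum(
        F.cosine_similarity(t, v, dim=0) for t, v in zip(text_roles, video_roles)
    )

# Rank a gallery of 100 videos against one query
query = torch.randn(3, 128)
gallery = torch.randn(100, 3, 128)
scores = torch.stack([role_similarity(query, v) for v in gallery])
ranking = scores.argsort(descending=True)
```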
Citations: 8
Hierarchical Region Proposal Refinement Network for Weakly Supervised Object Detection
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506087
Ming Zhang, Shuaicheng Liu, Bing Zeng
Weakly supervised object detection (WSOD) has attracted increasing attention because it requires only image-level annotations indicating whether a certain class is present. Most WSOD methods use multiple instance learning (MIL) to train an object detector, treating an image as a bag of candidate proposals. Unlike fully supervised object detection (FSOD), which uses an object-aware region proposal network (RPN) to generate effective candidate proposals, WSOD must rely on region proposal methods (e.g., selective search or edge boxes) because instance-level annotations (i.e., bounding boxes) are unavailable. However, the quality of the proposals influences the training of the detector. To solve this problem, we propose a hierarchical region proposal refinement network (HRPRN) that refines these proposals gradually. Specifically, our network contains multiple weakly supervised detectors trained stage by stage. In addition, we propose an instance regression refinement model that generates object-aware coordinate offsets to refine the proposals at each stage. To demonstrate the effectiveness of our method, we conduct experiments on the widely used PASCAL VOC 2007 benchmark. Compared with our baseline, online instance classifier refinement (OICR), our method achieves improvements of 9% in mAP and 5.6% in CorLoc.
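The offset-based refinement at each stage could be sketched as follows, assuming the standard R-CNN box-delta parameterization (an assumption; the abstract does not specify the parameterization). Each stage's output boxes feed the next stage.

```python
# Hypothetical one-stage proposal refinement via regressed coordinate offsets,
# using the common (dx, dy, dw, dh) box-delta convention.
import numpy as np

def refine_proposals(boxes, deltas):
    """boxes: (n, 4) as (x1, y1, x2, y2); deltas: (n, 4) as (dx, dy, dw, dh)."""
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    cx = boxes[:, 0] + 0.5 * w + deltas[:, 0] * w   # shift center by dx * width
    cy = boxes[:, 1] + 0.5 * h + deltas[:, 1] * h   # shift center by dy * height
    w = w * np.exp(deltas[:, 2])                    # rescale width
    h = h * np.exp(deltas[:, 3])                    # rescale height
    return np.stack([cx - 0.5 * w, cy - 0.5 * h,
                     cx + 0.5 * w, cy + 0.5 * h], axis=1)

# Stage-by-stage use: feed each stage's refined boxes into the next detector
boxes = np.array([[10.0, 10.0, 50.0, 60.0]])
refined = refine_proposals(boxes, np.array([[0.1, -0.05, 0.2, 0.0]]))
```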
Citations: 4
Adversarial Attack on Fake-Faces Detectors Under White and Black Box Scenarios
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506273
Xiying Wang, R. Ni, Wenjie Li, Yao Zhao
Generative Adversarial Network (GAN) models have been widely used in various fields. More recently, StyleGAN and StyleGAN2 have been developed to synthesize faces that are indistinguishable to the human eye, which could pose a threat to public security. Recent work has shown that such fakes can be identified using powerful CNN classifiers, but the reliability of these techniques is unknown. Therefore, this paper focuses on generating content-preserving images from fake faces to spoof such classifiers. Two GAN-based frameworks are proposed, for the white-box and black-box settings. For the white-box setting, a network without up/down-sampling is proposed to generate face images that confuse the classifier. In the black-box setting (where the classifier is unknown), real data is introduced to guide the GAN toward adversarial outputs, and a Real Extractor is added as an auxiliary network to constrain the feature distance between the generated images and the real data, enhancing the adversarial capability. Experimental results show that the proposed method effectively reduces the detection accuracy of forensic models, with good transferability.
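A minimal sketch of a white-box objective in this spirit appears below: the generated image is pushed toward the detector's "real" class while an L1 term preserves content. The loss weighting and the content term are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical white-box attack objective against a known fake-face detector.
# `detector` is assumed to return (n, 2) logits with class 0 = real, 1 = fake.
import torch
import torch.nn.functional as F

def attack_loss(generated, original, detector, alpha=10.0):
    """Push generated images toward the 'real' class while keeping them
    visually close to the original fake faces (content preservation)."""
    logits = detector(generated)
    fool = F.cross_entropy(logits, torch.zeros(len(logits), dtype=torch.long))
    content = F.l1_loss(generated, original)
    return fool + alpha * content
```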
Citations: 1
Nuclear Density Distribution Feature for Improving Cervical Histopathological Images Recognition
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506093
Zhuangzhuang Wang, Mengning Yang, Yangfan Lyu, Kairun Chen, Qicheng Tang
Cervical carcinoma is a common cancer of the female reproductive system. Early detection and diagnosis enable immediate treatment and can prevent progression of the disease. However, to achieve better performance, DL-based algorithms simply stack layers, with low interpretability. In this paper, a robust and reliable Nuclear Density Distribution Feature (NDDF), based on pathologists' priors, is proposed to improve Cervical Histopathological Image Classification (CHIC). Our method combines the nucleus mask segmented by U-Net with segmentation grid-lines generated from the pathology images using SLIC to obtain the NDDF map, which encodes the morphology, size, number, and spatial distribution of nuclei. The results show that a model trained with NDDF maps achieves better performance and accuracy than one trained on RGB images (patch-level histopathological images). More significantly, the accuracy of a two-stream network trained with RGB images and NDDF maps steadily improves over the corresponding baselines of different complexity.
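A rough sketch of building an NDDF-like map follows: intersect a binary nucleus mask (e.g., from U-Net) with SLIC superpixels and fill each superpixel with its nuclear-pixel fraction. Treating density as per-superpixel mask coverage is an assumption about how the map is populated.

```python
# Hypothetical NDDF-like map: per-superpixel nuclear density from a nucleus
# mask and SLIC superpixels (the segmentation grid-lines mentioned above).
import numpy as np
from skimage.segmentation import slic

def nddf_map(image, nucleus_mask, n_segments=400):
    """image: (h, w, 3) float array in [0, 1]; nucleus_mask: (h, w) binary."""
    segments = slic(image, n_segments=n_segments, start_label=0)
    density = np.zeros(nucleus_mask.shape, dtype=float)
    for label in np.unique(segments):
        region = segments == label
        density[region] = nucleus_mask[region].mean()  # nuclear-pixel fraction
    return density

# The density map can then be stacked with RGB as an extra network input
```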
Citations: 0
Segmentation-Aware Text-Guided Image Manipulation
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506601
T. Haruyama, Ren Togo, Keisuke Maeda, Takahiro Ogawa, M. Haseyama
We propose a novel approach that improves text-guided image manipulation performance. Text-guided image manipulation aims to modify parts of an input image in accordance with a user's text description by semantically associating image regions with the description. We tackle a problem of conventional methods: undesired parts get modified because of the gap in representation ability between text descriptions and images. Humans tend to pay attention primarily to objects in the foreground of images, and human text descriptions mostly describe the foreground. Therefore, it is necessary to introduce not only a foreground-aware bias based on the text description but also a background-aware bias for what the description does not cover. To solve this problem, we introduce an image segmentation network into the generative adversarial network used for image manipulation. Comparative experiments with three state-of-the-art methods show the effectiveness of our method quantitatively and qualitatively.
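One way to realize a background-aware bias, sketched below under the assumption of an explicit masked penalty, is to penalize changes outside the segmented foreground. The paper instead integrates the segmentation network into the GAN itself, so this is only an approximation of the idea.

```python
# Hypothetical background-preservation term: pixels outside the text-relevant
# foreground (given by a segmentation mask) should remain unchanged.
import torch
import torch.nn.functional as F

def background_preservation_loss(edited, original, foreground_mask):
    """edited, original: (n, c, h, w); foreground_mask: (n, 1, h, w) in {0, 1},
    where 1 marks regions the text description refers to."""
    background = 1.0 - foreground_mask
    return F.l1_loss(edited * background, original * background)
```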
Citations: 6
Partitioned Centerpose Network for Bottom-Up Multi-Person Pose Estimation
Pub Date: 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506555
Jiahua Wu, H. Lee
In bottom-up multi-person pose estimation, grouping joint candidates into the corresponding person instances is a challenging problem. In this paper, a new bottom-up method, the Partitioned CenterPose (PCP) Network, is proposed to better cluster all detected joints. To achieve this goal, a novel Partition Pose Representation (PPR) is proposed that integrates person instances and body joints via joint offsets. PPR uses the center of the human body and the offset between the center point and each body joint to encode the human pose. To better capture the relationships among body joints, we divide the human body into five parts and generate a sub-PPR in each part. Based on PPR, the PCP Network can detect persons and body joints simultaneously and then group all body joints by joint offset. Moreover, an improved $\ell_1$ loss is designed to obtain more accurate joint offsets. On the COCO keypoints dataset, the proposed method performs on par with the existing state-of-the-art bottom-up method in accuracy and speed.
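A minimal sketch of decoding a PPR follows: each joint is recovered as the person center plus a regressed offset. The abstract does not specify the improved $\ell_1$ loss, so a plain L1 loss stands in for it here.

```python
# Hypothetical PPR decoding and offset supervision; plain L1 stands in for
# the paper's unspecified "improved l1" loss.
import torch

def decode_joints(centers, offsets):
    """centers: (n, 2) person centers; offsets: (n, j, 2) per-joint offsets.
    Returns (n, j, 2) absolute joint locations."""
    return centers[:, None, :] + offsets

def offset_l1_loss(pred_offsets, gt_offsets):
    return (pred_offsets - gt_offsets).abs().mean()

joints = decode_joints(torch.rand(4, 2), torch.rand(4, 17, 2))  # 17 COCO joints
```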
Citations: 0