Contrastive Learning with Attention Mechanism and Multi-Scale Sample Network for Unpaired Image-to-Image Translation

2023 IEEE Symposium on Computers and Communications (ISCC) | Pub Date: 2023-07-09 | DOI: 10.1109/ISCC58397.2023.10218053

Yunhao Liu, Songyi Zhong, Zhenglin Li, Yangqiaoyu Zhou


Citations: 0

Abstract

The aim of unpaired image translation is to learn how to transform images from a source to a target domain, while preserving as many domain-invariant features as possible. Previous methods have not been able to separate foreground and background well, resulting in texture being added to the background. Moreover, these methods often fail to distinguish different objects or different parts of the same object. In this paper, we propose an attention-based generator (AG) that can redistribute the weights of visual features, significantly enhancing the network's performance in separating foreground and background. We also embed a multi-scale multilayer perceptron (MSMLP) into the framework to capture features across a broader range of scales, which improves the discrimination of various parts of objects. Our method outperforms existing methods on various datasets in terms of Fréchet inception distance. We further analyze the impact of different modules in our approach through subsequent ablation experiments.
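For readers unfamiliar with the metric named in the abstract, the Fréchet inception distance is conventionally defined as below. This is the standard definition, not a formula taken from the paper. The symbols $\mu_r, \Sigma_r$ (mean and covariance of Inception features of real target-domain images) and $\mu_g, \Sigma_g$ (the same statistics for generated images) are notation assumed here for illustration.

```latex
\[
\mathrm{FID}
  = \lVert \mu_r - \mu_g \rVert_2^{2}
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
      - 2\,\bigl( \Sigma_r \Sigma_g \bigr)^{1/2} \right)
\]
```

A lower value means the two feature distributions are closer, so "outperforms in terms of Fréchet inception distance" should be read as achieving lower FID. The abstract does not state which feature extractor settings or sample sizes were used.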


