Unified Image Harmonization with Region Augmented Attention Normalization

Annals of Data Science (Q1, Decision Sciences) | Pub Date: 2024-05-11 | DOI: 10.1007/s40745-024-00531-6
Junjie Hou, Yuqi Zhang, Duo Su
Citations: 0

Abstract

Image harmonization adjusts the foreground of a composite image so that it is visually consistent with the background. In academic research, the task conventionally takes a simple synthesized image and a matching mask as inputs. In practical applications, however, precise masks are difficult to obtain, which creates a notable gap between research findings and real-world applicability. To narrow this gap, we redefine the task as “Unified Image Harmonization,” in which the input is only a single image. We develop a novel framework for this setting: inharmonious region localization first detects the mask, which is then used for harmonization. The pivotal component of harmonization is normalization, which is responsible for information transfer. Current background-to-foreground transfer and guidance mechanisms, however, are limited by single-layer guidance, which constrains their effectiveness. To overcome this limitation, we introduce Region Augmented Attention Normalization (RA2N), which enhances the attention mechanism for foreground feature alignment and thereby improves alignment and transfer. In qualitative and quantitative comparisons on the iHarmony4 dataset, our model exhibits exceptional performance not only in unified image harmonization but also in conventional image harmonization.
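The two-stage flow described in the abstract (localize the inharmonious region, then normalize the foreground using background information) can be illustrated with a minimal NumPy sketch. This is not the paper's RA2N: the attention mechanism is omitted, and a plain per-channel statistics transfer stands in for it as an assumption. The names `masked_stats`, `transfer_background_stats`, `unified_harmonize`, and the `localizer` callable are hypothetical and chosen only for illustration.

```python
import numpy as np

def masked_stats(features, mask, eps=1e-5):
    """Per-channel mean and std of features (C, H, W) over pixels where mask == 1."""
    weights = mask[None, :, :]                 # (1, H, W), broadcast over channels
    count = max(weights.sum(), 1.0)            # guard against an empty region
    mean = (features * weights).sum(axis=(1, 2)) / count
    centered = features - mean[:, None, None]
    var = ((centered ** 2) * weights).sum(axis=(1, 2)) / count
    return mean, np.sqrt(var + eps)

def transfer_background_stats(features, fg_mask, eps=1e-5):
    """Simplified stand-in for region-based normalization (assumption, not RA2N).

    Foreground features are standardized with their own statistics and then
    rescaled with the background statistics; background pixels are unchanged.
    """
    fg_mask = fg_mask.astype(features.dtype)
    bg_mask = 1.0 - fg_mask
    fg_mean, fg_std = masked_stats(features, fg_mask, eps)
    bg_mean, bg_std = masked_stats(features, bg_mask, eps)
    normalized = (features - fg_mean[:, None, None]) / fg_std[:, None, None]
    restyled = normalized * bg_std[:, None, None] + bg_mean[:, None, None]
    return features * bg_mask[None] + restyled * fg_mask[None]

def unified_harmonize(features, localizer):
    """Mask-free setting: predict the mask first, then harmonize with it.

    `localizer` is a hypothetical callable returning an (H, W) array of 0/1;
    in the paper this role is played by inharmonious region localization.
    """
    fg_mask = localizer(features)
    return transfer_background_stats(features, fg_mask)
```

In this sketch the only input to `unified_harmonize` is the feature map itself, which mirrors the single-image input of the unified setting; a conventional harmonization call would pass a known mask directly to `transfer_background_stats`.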

Source journal
Annals of Data Science (Decision Sciences: Statistics, Probability and Uncertainty)
CiteScore: 6.50
Self-citation rate: 0.00%
Articles published: 93
Journal description: Annals of Data Science (ADS) publishes cutting-edge research findings, experimental results and case studies of data science. Although Data Science is regarded as an interdisciplinary field that uses mathematics, statistics, databases, data mining, high-performance computing, knowledge management and virtualization to discover knowledge from Big Data, it should have its own scientific contents, such as axioms, laws and rules, which are fundamentally important for experts in different fields to explore their own interests from Big Data. ADS encourages contributors to address such challenging problems at this exchange platform. At present, how to discover knowledge from heterogeneous data in a Big Data environment needs to be addressed. ADS is a series of volumes edited by either the editorial office or guest editors. Guest editors will be responsible for call-for-papers and the review process for high-quality contributions in their volumes.