Referring flexible image restoration

IF 9.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Expert Systems with Applications, Volume 274, Article 126857 Pub Date: 2025-05-15 Epub Date: 2025-02-24 DOI: 10.1016/j.eswa.2025.126857
Runwei Guan, Rongsheng Hu, Zhuhao Zhou, Tianlang Xue, Ka Lok Man, Jeremy Smith, Eng Gee Lim, Weiping Ding, Yutao Yue
Citations: 0

Abstract

Referring flexible image restoration
In real-world conditions, images often exhibit several degradations at once, such as rain and fog at night (a triple degradation). In many cases, however, users may not want every degradation removed. Consider a blurry lens capturing a beautiful snowy landscape (a double degradation): here one may only want to deblur. Such situations and requirements point to a new challenge in image restoration, in which a model must perceive and remove the specific degradation types named by human commands in images with multiple degradations. We term this task Referring Flexible Image Restoration (RFIR). To address it, we first construct a large-scale synthetic dataset, also called RFIR, comprising 153,423 samples, each consisting of a degraded image, a text prompt requesting removal of specific degradations, and the restored image. RFIR covers five basic degradation types (blur, rain, haze, low light, and snow) and includes six main sub-categories for varying degrees of degradation removal. To tackle the task, we propose a novel transformer-based multi-task model named TransRFIR, which simultaneously perceives the degradation types in the degraded image and removes the specific degradations requested by the text prompt. TransRFIR is built on two purpose-designed modules: Multi-Head Agent Self-Attention (MHASA) for multi-degradation context modeling, and Multi-Head Agent Cross Attention (MHACA) for efficient alignment between the prompt and the referred degradations. Both modules introduce agent tokens and reach linear complexity, achieving lower computation cost than vanilla self-attention and cross-attention while obtaining competitive performance. TransRFIR achieves state-of-the-art performance compared with its counterparts and proves to be an effective basic structure for image restoration. We release our project at https://github.com/GuanRunwei/FIR-CP.
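As a rough illustration of how agent tokens can bring attention down to linear complexity, the single-head sketch below follows the generic agent-attention pattern in plain Python: a small set of agent tokens first aggregates information from the keys and values, then broadcasts it back to the queries. It is not the authors' MHASA or MHACA implementation. The function names, the pooling of queries into agents, and the omission of multiple heads and learned projections are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def pool_agents(q, num_agents):
    """Assumption: agent tokens are averages of contiguous groups of queries."""
    if num_agents <= 0 or len(q) % num_agents != 0:
        raise ValueError("len(q) must be a positive multiple of num_agents")
    size = len(q) // num_agents
    groups = [q[i * size:(i + 1) * size] for i in range(num_agents)]
    return [[sum(col) / size for col in zip(*group)] for group in groups]


def agent_attention(q, k, v, num_agents):
    """Two-step attention routed through a small set of agent tokens."""
    agents = pool_agents(q, num_agents)
    # Step 1 (aggregation): each agent attends to all keys and values.
    agent_values = attention(agents, k, v)
    # Step 2 (broadcast): each query attends only to the agents.
    return attention(q, agents, agent_values)


if __name__ == "__main__":
    # Hypothetical toy tokens: 4 queries, 4 keys/values, 2 channels.
    q = [[0.1, 0.2], [0.3, -0.1], [0.5, 0.4], [-0.2, 0.0]]
    k = [[0.2, 0.1], [0.0, 0.3], [0.4, -0.2], [0.1, 0.1]]
    v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
    out = agent_attention(q, k, v, num_agents=2)
    print(len(out), len(out[0]))
```

With n queries, m keys, a fixed number a of agents, and channel width d, the two attention steps cost on the order of a·m·d and n·a·d operations, compared with n·m·d for vanilla attention. Passing image tokens as `q` and prompt tokens as `k` and `v` gives the cross-attention analogue of the same idea.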
Source journal
Expert Systems with Applications (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months
About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.