MSI-DTrans:利用多层语义交互和动态变换器进行多焦点图像融合

IF 3.7 2区 工程技术 Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Displays Pub Date : 2024-09-17 DOI:10.1016/j.displa.2024.102837
Hao Zhai, Yuncan Ouyang, Nannan Luo, Lianhua Chen, Zhi Zeng
{"title":"MSI-DTrans:利用多层语义交互和动态变换器进行多焦点图像融合","authors":"Hao Zhai,&nbsp;Yuncan Ouyang,&nbsp;Nannan Luo,&nbsp;Lianhua Chen,&nbsp;Zhi Zeng","doi":"10.1016/j.displa.2024.102837","DOIUrl":null,"url":null,"abstract":"<div><p>Multi-focus image fusion (MFIF) aims to utilize multiple images with different focal lengths to fuse into a single full-focus image. This process enhances the realism and clarity of the resulting image. In this paper, a MFIF method called MSI-DTrans was proposed. On the one hand, in order to fully utilize all the effective information that the source image carries, the proposed method adopts a multilayer semantic interaction strategy to enhance the interaction of high-frequency and low-frequency information. This approach gradually mines more abstract semantic information, guiding the generation of feature maps from coarse to fine. On the other hand, a parallel multi-scale joint self-attention computation model is designed. The model adopts dynamic sense field and dynamic token embedding to overcome the performance degradation problem when dealing with multi-scale objects. This enables self-attention to integrate long-range dependencies between objects of different scales and reduces computational overhead. Numerous experimental results show that the proposed method effectively avoids image distortion, achieves better visualization results, and demonstrates good competitiveness with many state-of-the-art methods in terms of qualitative and quantitative analysis, as well as efficiency comparison. The source code is available at <span><span>https://github.com/ouyangbaicai/MSI-DTrans</span><svg><path></path></svg></span>.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102837"},"PeriodicalIF":3.7000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MSI-DTrans: A multi-focus image fusion using multilayer semantic interaction and dynamic transformer\",\"authors\":\"Hao Zhai,&nbsp;Yuncan Ouyang,&nbsp;Nannan Luo,&nbsp;Lianhua Chen,&nbsp;Zhi Zeng\",\"doi\":\"10.1016/j.displa.2024.102837\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Multi-focus image fusion (MFIF) aims to utilize multiple images with different focal lengths to fuse into a single full-focus image. This process enhances the realism and clarity of the resulting image. In this paper, a MFIF method called MSI-DTrans was proposed. On the one hand, in order to fully utilize all the effective information that the source image carries, the proposed method adopts a multilayer semantic interaction strategy to enhance the interaction of high-frequency and low-frequency information. This approach gradually mines more abstract semantic information, guiding the generation of feature maps from coarse to fine. On the other hand, a parallel multi-scale joint self-attention computation model is designed. The model adopts dynamic sense field and dynamic token embedding to overcome the performance degradation problem when dealing with multi-scale objects. This enables self-attention to integrate long-range dependencies between objects of different scales and reduces computational overhead. Numerous experimental results show that the proposed method effectively avoids image distortion, achieves better visualization results, and demonstrates good competitiveness with many state-of-the-art methods in terms of qualitative and quantitative analysis, as well as efficiency comparison. The source code is available at <span><span>https://github.com/ouyangbaicai/MSI-DTrans</span><svg><path></path></svg></span>.</p></div>\",\"PeriodicalId\":50570,\"journal\":{\"name\":\"Displays\",\"volume\":\"85 \",\"pages\":\"Article 102837\"},\"PeriodicalIF\":3.7000,\"publicationDate\":\"2024-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Displays\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0141938224002014\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Displays","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0141938224002014","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

摘要

多焦图像融合(MFIF)旨在利用不同焦距的多幅图像融合成一幅全焦图像。这一过程可增强图像的真实感和清晰度。本文提出了一种名为 MSI-DTrans 的 MFIF 方法。一方面,为了充分利用源图像所携带的所有有效信息,所提出的方法采用了多层语义交互策略,以增强高频和低频信息的交互。这种方法能逐渐挖掘出更多抽象的语义信息,引导特征图的生成由粗到细。另一方面,设计了一个并行的多尺度联合自我注意计算模型。该模型采用动态感知场和动态标记嵌入来克服处理多尺度对象时的性能下降问题。这使得自注意能够整合不同尺度对象之间的长距离依赖关系,并减少计算开销。大量实验结果表明,所提出的方法有效地避免了图像失真,实现了更好的可视化效果,并在定性和定量分析以及效率比较方面与许多最先进的方法相比具有良好的竞争力。源代码见 https://github.com/ouyangbaicai/MSI-DTrans。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
MSI-DTrans: A multi-focus image fusion using multilayer semantic interaction and dynamic transformer

Multi-focus image fusion (MFIF) aims to utilize multiple images with different focal lengths to fuse into a single full-focus image. This process enhances the realism and clarity of the resulting image. In this paper, a MFIF method called MSI-DTrans was proposed. On the one hand, in order to fully utilize all the effective information that the source image carries, the proposed method adopts a multilayer semantic interaction strategy to enhance the interaction of high-frequency and low-frequency information. This approach gradually mines more abstract semantic information, guiding the generation of feature maps from coarse to fine. On the other hand, a parallel multi-scale joint self-attention computation model is designed. The model adopts dynamic sense field and dynamic token embedding to overcome the performance degradation problem when dealing with multi-scale objects. This enables self-attention to integrate long-range dependencies between objects of different scales and reduces computational overhead. Numerous experimental results show that the proposed method effectively avoids image distortion, achieves better visualization results, and demonstrates good competitiveness with many state-of-the-art methods in terms of qualitative and quantitative analysis, as well as efficiency comparison. The source code is available at https://github.com/ouyangbaicai/MSI-DTrans.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Displays
Displays 工程技术-工程:电子与电气
CiteScore
4.60
自引率
25.60%
发文量
138
审稿时长
92 days
期刊介绍: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally featured.
期刊最新文献
A assessment method for ergonomic risk based on fennec fox optimization algorithm and generalized regression neural network An overview of bit-depth enhancement: Algorithm datasets and evaluation No-reference underwater image quality assessment based on Multi-Scale and mutual information analysis DHDP-SLAM: Dynamic Hierarchical Dirichlet Process based data association for semantic SLAM Fabrication and Reflow of Indium Bumps for Active-Matrix Micro-LED Display of 3175 PPI
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1