LDDG: Long-distance dependent and dual-stream guided feature fusion network for co-saliency object detection

IF 3.7 | Tier 2 (Engineering Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Displays | Pub Date: 2024-06-04 | DOI: 10.1016/j.displa.2024.102767
Longsheng Wei, Siyuan Guo, Jiu Huang, Xuan Fan
{"title":"LDDG:用于共锯齿物体检测的长距离依赖和双流引导特征融合网络","authors":"Longsheng Wei ,&nbsp;Siyuan Guo ,&nbsp;Jiu Huang ,&nbsp;Xuan Fan","doi":"10.1016/j.displa.2024.102767","DOIUrl":null,"url":null,"abstract":"<div><p>Complex image scenes are a challenge in the collaborative saliency object detection task in the field of saliency detection, such as the inability to accurately locate salient object, surrounding background information affecting object recognition, and the inability to fuse multi-layer collaborative features well. To solve these problems, we propose a long-range dependent and dual-stream guided feature fusion network. Firstly, we enhance saliency feature by the proposed coordinate attention module so that the network can learn a better feature representation. Secondly, we capture the long-range dependency information of image feature by the proposed non-local module, to obtain more comprehensive contextual complex information. At lastly, we propose a dual-stream guided network to fuse multiple layers of synergistic saliency features. The dual-stream guided network includes classification streams and mask streams, and the layers in the decoding network are guided to fuse the feature of each layer to output more accurate synoptic saliency prediction map. The experimental results show that our method is superior to the existing methods on three common datasets: CoSal2015, CoSOD3k, and CoCA.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102767"},"PeriodicalIF":3.7000,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"LDDG: Long-distance dependent and dual-stream guided feature fusion network for co-saliency object detection\",\"authors\":\"Longsheng Wei ,&nbsp;Siyuan Guo ,&nbsp;Jiu Huang ,&nbsp;Xuan Fan\",\"doi\":\"10.1016/j.displa.2024.102767\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Complex image scenes are a challenge in the collaborative saliency object detection task in the field of saliency detection, such as the inability to accurately locate salient object, surrounding background information affecting object recognition, and the inability to fuse multi-layer collaborative features well. To solve these problems, we propose a long-range dependent and dual-stream guided feature fusion network. Firstly, we enhance saliency feature by the proposed coordinate attention module so that the network can learn a better feature representation. Secondly, we capture the long-range dependency information of image feature by the proposed non-local module, to obtain more comprehensive contextual complex information. At lastly, we propose a dual-stream guided network to fuse multiple layers of synergistic saliency features. The dual-stream guided network includes classification streams and mask streams, and the layers in the decoding network are guided to fuse the feature of each layer to output more accurate synoptic saliency prediction map. 
The experimental results show that our method is superior to the existing methods on three common datasets: CoSal2015, CoSOD3k, and CoCA.</p></div>\",\"PeriodicalId\":50570,\"journal\":{\"name\":\"Displays\",\"volume\":\"84 \",\"pages\":\"Article 102767\"},\"PeriodicalIF\":3.7000,\"publicationDate\":\"2024-06-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Displays\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0141938224001318\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Displays","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0141938224001318","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

Complex image scenes pose a challenge for the co-saliency object detection task in the field of saliency detection: salient objects cannot be located accurately, surrounding background information interferes with object recognition, and multi-layer collaborative features are difficult to fuse well. To address these problems, we propose a long-range dependent and dual-stream guided feature fusion network. First, we enhance the saliency features with the proposed coordinate attention module so that the network learns a better feature representation. Second, we capture the long-range dependency information of image features with the proposed non-local module to obtain more comprehensive contextual information. Finally, we propose a dual-stream guided network to fuse multiple layers of co-saliency features. The dual-stream guided network consists of a classification stream and a mask stream, which guide the layers of the decoding network to fuse the features of each layer and output a more accurate co-saliency prediction map. Experimental results show that our method outperforms existing methods on three common datasets: CoSal2015, CoSOD3k, and CoCA.
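The abstract credits a coordinate attention module with enhancing the saliency features before fusion. The paper's exact variant is not given here, so the sketch below implements the standard coordinate attention mechanism (direction-aware pooling along height and width, a shared bottleneck convolution, then per-direction sigmoid gates) only as a reference point; the module name, reduction ratio, and layer choices are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Standard coordinate attention (a sketch, not the paper's module):
    pool along W and along H separately, mix through a shared 1x1 conv,
    then reweight the input with per-direction sigmoid gates."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # (B, C, H, 1): pooled over W
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # (B, C, 1, W): pooled over H
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.size()
        x_h = self.pool_h(x)                              # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)          # (B, C, W, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                         # height gate (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))     # width gate  (B, C, 1, W)
        return x * a_h * a_w
```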
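For the long-range dependency step, the abstract points to a non-local module. As a hedged illustration of what such a module typically computes, the following sketch is the classic non-local (self-attention) block, in which every spatial position attends to every other position; the paper's version may differ in its embedding or normalization choices.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Classic non-local block (a sketch): pairwise attention over all
    spatial positions followed by a residual connection, so the output
    mixes in context from the whole feature map."""
    def __init__(self, channels):
        super().__init__()
        self.inter = max(1, channels // 2)
        self.theta = nn.Conv2d(channels, self.inter, kernel_size=1)  # queries
        self.phi = nn.Conv2d(channels, self.inter, kernel_size=1)    # keys
        self.g = nn.Conv2d(channels, self.inter, kernel_size=1)      # values
        self.out = nn.Conv2d(self.inter, channels, kernel_size=1)

    def forward(self, x):
        b, _, h, w = x.size()
        q = self.theta(x).view(b, self.inter, -1).permute(0, 2, 1)   # (B, HW, C')
        k = self.phi(x).view(b, self.inter, -1)                      # (B, C', HW)
        v = self.g(x).view(b, self.inter, -1).permute(0, 2, 1)       # (B, HW, C')
        attn = torch.softmax(torch.bmm(q, k), dim=-1)                # (B, HW, HW)
        y = torch.bmm(attn, v).permute(0, 2, 1).view(b, self.inter, h, w)
        return x + self.out(y)   # residual keeps the original local signal
```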
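The dual-stream guided network is described only at a high level: a classification stream and a mask stream guide how the decoder layers fuse their features. The sketch below shows one plausible wiring, treating the classification stream's output as a channel-wise gate and the mask stream's output as a spatial gate before merging a decoder layer with the upsampled feature from the deeper layer; the shapes, gating scheme, and module name are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualStreamGuidedFusion(nn.Module):
    """Hypothetical fusion step: class_vec (from a classification stream)
    gates channels, mask (from a mask stream) gates spatial positions,
    then the gated feature is fused with the deeper decoder feature."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, feat, deeper_feat, class_vec, mask):
        # feat: (B, C, H, W); deeper_feat: (B, C, h', w')
        # class_vec: (B, C) co-category guidance; mask: (B, 1, h, w) coarse saliency
        deeper_feat = F.interpolate(deeper_feat, size=feat.shape[2:],
                                    mode="bilinear", align_corners=False)
        mask = F.interpolate(mask, size=feat.shape[2:],
                             mode="bilinear", align_corners=False)
        feat = feat * torch.sigmoid(class_vec)[:, :, None, None]  # channel-wise gate
        feat = feat * torch.sigmoid(mask)                          # spatial gate
        return self.fuse(torch.cat([feat, deeper_feat], dim=1))
```

Applied layer by layer from the deepest decoder stage upward, this kind of gating lets a coarse mask suppress background regions while the classification vector emphasizes channels tied to the co-salient category, which matches the role the abstract assigns to the two streams.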

Source journal: Displays (Engineering Technology; Engineering: Electrical & Electronic)
CiteScore: 4.60
Self-citation rate: 25.60%
Annual articles: 138
Review time: 92 days
Journal description
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including the display-human interface. Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technology and human factors engineers new to the field, will also occasionally be featured.