A hybrid attention multi-scale fusion network for real-time semantic segmentation.

IF 3.9, Zone 2 (multidisciplinary journals), Q1 (Multidisciplinary Sciences). Scientific Reports. Pub Date: 2025-01-06. DOI: 10.1038/s41598-024-84685-6
Baofeng Ye, Renzheng Xue, Qianlong Wu
Scientific Reports, vol. 15, no. 1, p. 872. Journal Article.
Citations: 0

Abstract

In semantic segmentation, spatial information and receptive fields are both essential. However, most current algorithms focus on acquiring semantic information and lose a significant amount of spatial information, which improves real-time inference speed but significantly reduces accuracy. This paper proposes a new method to address this issue. Specifically, we design a new module (HFRM) that combines channel attention and spatial attention to recover the spatial information lost during downsampling and to improve object classification accuracy. To fuse spatial and semantic information, we design a second module (HFFM) that merges features from two different levels more effectively and captures a larger receptive field through an attention mechanism. Additionally, edge detection methods are incorporated to enhance the extraction of boundary information. Experimental results show that, for an input size of 512 × 1024, the proposed method achieves 73.6% mIoU at 176 frames per second (FPS) on the Cityscapes dataset and 70.0% mIoU at 146 FPS on CamVid. Compared with existing networks, our model achieves faster inference while maintaining accuracy, which makes it more practical.
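The accuracy figures above are reported as mIoU (mean intersection-over-union). As a rough illustration of how that metric is computed, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function name `mean_iou`, the flattened-list input format, and the `ignore_index=255` default (a common convention for unlabeled pixels) are assumptions made for this example. Benchmark scores are usually accumulated over the whole dataset, whereas this sketch scores a single prediction.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int,
             ignore_index: int = 255) -> float:
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flattened label maps (one class id per pixel).
    Pixels whose target label equals ignore_index are skipped (assumed convention).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: a flattened 2 x 3 label map with two classes.
target = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
print(mean_iou(pred, target, num_classes=2))  # 0.5
```

In the toy example each class has 2 correctly labeled pixels and a union of 4 pixels, so both per-class IoUs are 0.5 and so is their mean.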

Source journal
Scientific Reports (Natural Science Disciplines)
CiteScore: 7.50
Self-citation rate: 4.30%
Articles published: 19567
Review time: 3.9 months
Journal description: We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections. Scientific Reports has a 2-year impact factor: 4.380 (2021), and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).
• Engineering: Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.
• Physical sciences: Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature, often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.
• Earth and environmental sciences: Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.
• Biological sciences: Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms from microorganisms and animals to plants.
• Health sciences: The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.