EMBANet: A flexible efficient multi-branch attention network

IF 6.3 | Q1 (JCR Region 1, Computer Science) | COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Neural Networks | Pub Date: 2025-02-11 | DOI: 10.1016/j.neunet.2025.107248
Keke Zu , Hu Zhang , Lei Zhang , Jian Lu , Chen Xu , Hongyang Chen , Yu Zheng
Neural Networks, Volume 185, Article 107248. Published online 2025-02-11 (Journal Article). Citations: 0.

Abstract

Recent advances in the design of convolutional neural networks have shown that performance can be enhanced by improving the ability to represent multi-scale features. However, most existing methods either focus on designing more sophisticated attention modules, which leads to higher computational costs, or fail to effectively establish long-range channel dependencies, or neglect the extraction and utilization of structural information. This work introduces a novel module, the Multi-Branch Concatenation (MBC), designed to process input tensors and extract multi-scale feature maps. The MBC module introduces new degrees of freedom (DoF) in the design of attention networks by allowing for flexible adjustments to the types of transformation operators and the number of branches. This study considers two key transformation operators: multiplexing and splitting, both of which facilitate a more granular representation of multi-scale features and enhance the receptive field range. By integrating the MBC with an attention module, a Multi-Branch Attention (MBA) module is developed to capture channel-wise interactions within feature maps, thereby establishing long-range channel dependencies. Replacing the 3×3 convolutions in the bottleneck blocks of ResNet with the proposed MBA yields a new block, the Efficient Multi-Branch Attention (EMBA), which can be seamlessly integrated into state-of-the-art backbone CNN models. Furthermore, a new backbone network, named EMBANet, is constructed by stacking EMBA blocks. The proposed EMBANet has been thoroughly evaluated across various computer vision tasks, including classification, detection, and segmentation, consistently demonstrating superior performance compared to popular backbones.
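The MBC-then-attention pipeline described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: a box blur of growing window size stands in for the per-branch multi-scale convolutions, and a softmax over globally pooled channel statistics stands in for the paper's attention module. The names `mbc_split` and `mba` are illustrative only.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def mbc_split(x, branches=4):
    """Multi-Branch Concatenation via the 'splitting' operator (sketch).
    Splits the channels into `branches` groups; each group receives a
    stand-in multi-scale transform (a box blur of growing size)."""
    c, h, w = x.shape
    groups = np.split(x, branches, axis=0)
    outs = []
    for i, g in enumerate(groups):
        k = 2 * i + 1          # growing receptive field per branch: 1, 3, 5, 7
        pad = k // 2
        padded = np.pad(g, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
        blurred = np.empty_like(g)
        for y in range(h):     # k x k box blur as a placeholder for a k x k conv
            for xx in range(w):
                blurred[:, y, xx] = padded[:, y:y + k, xx:xx + k].mean(axis=(1, 2))
        outs.append(blurred)
    # concatenate the multi-scale branch outputs back along the channel axis
    return np.concatenate(outs, axis=0)

def mba(x, branches=4):
    """Multi-Branch Attention (sketch): MBC followed by channel-wise attention."""
    feats = mbc_split(x, branches)
    z = feats.mean(axis=(1, 2))           # squeeze: global average pool per channel
    w = softmax(z) * feats.shape[0]       # softmax couples all channels (long-range
                                          # dependencies); rescaled to mean ~1
    return feats * w[:, None, None]       # reweight channels

x = np.random.default_rng(0).standard_normal((8, 16, 16))  # (C, H, W) feature map
y = mba(x)                                                 # same shape as the input
```

Because the output keeps the input's shape, such a block could drop in where a ResNet bottleneck's 3×3 convolution sits, which is how the paper builds its EMBA block.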
Source journal: Neural Networks (Engineering & Technology; Computer Science: Artificial Intelligence)
CiteScore: 13.90
Self-citation rate: 7.70%
Annual articles: 425
Review time: 67 days
Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.