{"title":"EAFF-Net:用于双模态行人检测的高效注意力特征融合网络","authors":"Ying Shen, Xiaoyang Xie, Jing Wu, Liqiong Chen, Feng Huang","doi":"10.1016/j.infrared.2024.105696","DOIUrl":null,"url":null,"abstract":"<div><div>The pedestrian detection network utilizing a combination of infrared and visible image pairs can improve detection accuracy by fusing their complementary information, especially in challenging illumination conditions. However, most existing dual-modality methods only focus on the effectiveness of feature maps between different modalities while neglecting the issue of redundant information in the modalities. This oversight often affects the detection performance in low illumination conditions. This paper proposes an efficient attention feature fusion network (EAFF-Net), which suppresses redundant information and enhances the fusion of features from dual-modality images. Firstly, we design a dual-backbone network based on CSPDarknet53 and combine with an efficient partial spatial pyramid pooling module (EPSPPM), improving the efficiency of feature extraction in different modalities. Secondly, a feature attention fusion module (FAFM) is built to adaptively weaken modal redundant information to improve the fusion effect of features. Finally, a deep attention pyramid module (DAPM) is proposed to cascade multi-scale feature information and obtain more detailed features of small targets. The effectiveness of EAFF-Net in pedestrian detection has been demonstrated through experiments conducted on two public datasets.</div></div>","PeriodicalId":13549,"journal":{"name":"Infrared Physics & Technology","volume":"145 ","pages":"Article 105696"},"PeriodicalIF":3.4000,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"EAFF-Net: Efficient attention feature fusion network for dual-modality pedestrian detection\",\"authors\":\"Ying Shen, Xiaoyang Xie, Jing Wu, Liqiong Chen, Feng Huang\",\"doi\":\"10.1016/j.infrared.2024.105696\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The pedestrian detection network utilizing a combination of infrared and visible image pairs can improve detection accuracy by fusing their complementary information, especially in challenging illumination conditions. However, most existing dual-modality methods only focus on the effectiveness of feature maps between different modalities while neglecting the issue of redundant information in the modalities. This oversight often affects the detection performance in low illumination conditions. This paper proposes an efficient attention feature fusion network (EAFF-Net), which suppresses redundant information and enhances the fusion of features from dual-modality images. Firstly, we design a dual-backbone network based on CSPDarknet53 and combine with an efficient partial spatial pyramid pooling module (EPSPPM), improving the efficiency of feature extraction in different modalities. Secondly, a feature attention fusion module (FAFM) is built to adaptively weaken modal redundant information to improve the fusion effect of features. Finally, a deep attention pyramid module (DAPM) is proposed to cascade multi-scale feature information and obtain more detailed features of small targets. The effectiveness of EAFF-Net in pedestrian detection has been demonstrated through experiments conducted on two public datasets.</div></div>\",\"PeriodicalId\":13549,\"journal\":{\"name\":\"Infrared Physics & Technology\",\"volume\":\"145 \",\"pages\":\"Article 105696\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2025-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Infrared Physics & Technology\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1350449524005802\",\"RegionNum\":3,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/3 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"INSTRUMENTS & INSTRUMENTATION\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Infrared Physics & Technology","FirstCategoryId":"101","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1350449524005802","RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/3 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"INSTRUMENTS & INSTRUMENTATION","Score":null,"Total":0}
EAFF-Net: Efficient attention feature fusion network for dual-modality pedestrian detection
The pedestrian detection network utilizing a combination of infrared and visible image pairs can improve detection accuracy by fusing their complementary information, especially in challenging illumination conditions. However, most existing dual-modality methods only focus on the effectiveness of feature maps between different modalities while neglecting the issue of redundant information in the modalities. This oversight often affects the detection performance in low illumination conditions. This paper proposes an efficient attention feature fusion network (EAFF-Net), which suppresses redundant information and enhances the fusion of features from dual-modality images. Firstly, we design a dual-backbone network based on CSPDarknet53 and combine with an efficient partial spatial pyramid pooling module (EPSPPM), improving the efficiency of feature extraction in different modalities. Secondly, a feature attention fusion module (FAFM) is built to adaptively weaken modal redundant information to improve the fusion effect of features. Finally, a deep attention pyramid module (DAPM) is proposed to cascade multi-scale feature information and obtain more detailed features of small targets. The effectiveness of EAFF-Net in pedestrian detection has been demonstrated through experiments conducted on two public datasets.
期刊介绍:
The Journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. Infrared'' is defined as covering the near, mid and far infrared (terahertz) regions from 0.75um (750nm) to 1mm (300GHz.) Submissions in the 300GHz to 100GHz region may be accepted at the editors discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
Its core topics can be summarized as the generation, propagation and detection, of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.