CSI-Net: CNN Swin Transformer Integrated Network for Infrared Small Target Detection
Authors: Lammi Choi, Won Young Chung, Chan Gook Park
Journal: International Journal of Control, Automation and Systems (Journal Article; impact factor 2.5; JCR Q2, Automation & Control Systems)
Publication date: 2024-09-02
DOI: 10.1007/s12555-024-0089-8 (https://doi.org/10.1007/s12555-024-0089-8)
Citations: 0
Abstract
In infrared (IR) small target detection, accurately pinpointing blurry, low-contrast targets is highly challenging because of the intricate features of IR images. To tackle this, we introduce CSI-Net, a novel network architecture that merges a CNN with a Swin Transformer. CSI-Net features a hybrid encoder design that blends the encoder-decoder layout of UNet with a Swin Transformer executed in parallel with the CNN. This combination enables the network to capture both local features and long-distance dependencies, enhancing its ability to identify small targets accurately. Leveraging the hierarchical features of the Swin Transformer, CSI-Net captures contextual information crucial for small target detection. Moreover, CSI-Net employs full-scale skip connections across the encoder-decoder and decoder-decoder paths, integrating multiscale CNN and Swin Transformer features to improve gradient propagation. Experimental results validate the superiority of the proposed method over traditional CNN and Transformer methods. On NUAA-SIRST, the mIoU (0.7483), detection probability (0.9734), and false alarm rate (0.101 × 10⁻⁵) show significant improvement. Similarly, on NUDT-SIRST, the mIoU (0.8887), detection probability (0.9894), and false alarm rate (0.431 × 10⁻⁵) show notable improvement. The network's performance scales with dataset size, and its robustness is affirmed by the area under the ROC curve (AUC). Additionally, an ablation study validates the efficacy of the hybrid encoder: varying the presence of the parallel Swin Transformer module (PSM) reveals that applying it enhances small target detection performance. The comprehensive evaluation shows that the Swin Transformer-enhanced UNet architecture effectively tackles the challenges of IR small target detection.
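The three metrics quoted in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the definitions below are assumptions based on common usage in small-target segmentation: IoU is computed per pixel, detection probability is the fraction of ground-truth targets (8-connected components) touched by at least one predicted pixel, and the false alarm rate is the fraction of all pixels that are predicted but not in the ground truth. The function names and the toy masks are hypothetical and are not taken from the paper.

```python
from collections import deque

def iou(pred, truth):
    """Pixel-level intersection over union of two binary masks (lists of rows)."""
    inter = union = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 1.0

def mean_iou(pairs):
    """Assumed reading of mIoU: the mean of per-image IoU over (pred, truth) pairs."""
    scores = [iou(p, t) for p, t in pairs]
    return sum(scores) / len(scores)

def components(mask):
    """8-connected components of a binary mask, each as a set of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen, out = set(), []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or (r, c) in seen:
                continue
            comp, queue = set(), deque([(r, c)])
            seen.add((r, c))
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                comp.add((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            out.append(comp)
    return out

def detection_and_false_alarm(pred, truth):
    """Return (Pd, Fa) under the assumed definitions described above."""
    targets = components(truth)
    hit = sum(1 for t in targets if any(pred[y][x] for y, x in t))
    pd = hit / len(targets) if targets else 1.0
    total = len(pred) * len(pred[0])
    false_pixels = sum(1 for y, row in enumerate(pred)
                       for x, p in enumerate(row) if p and not truth[y][x])
    return pd, false_pixels / total

# Toy 4x4 example: two true targets, one partly detected, one false pixel.
truth = [[1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 0, 0]]
pred = [[1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 1, 0, 0]]

print(iou(pred, truth))                        # 0.25
print(detection_and_false_alarm(pred, truth))  # (0.5, 0.0625)
```

In the toy example, one of the two targets is hit (Pd = 0.5) and one of the 16 pixels is a false alarm (Fa = 0.0625). Published evaluations often match targets by centroid distance rather than pixel overlap, so figures computed this way are not directly comparable with those reported in the abstract.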
About the journal:
International Journal of Control, Automation and Systems is a joint publication of the Institute of Control, Robotics and Systems (ICROS) and the Korean Institute of Electrical Engineers (KIEE).
The journal covers three closely related research areas: control, automation, and systems.
The technical areas include:
Control Theory
Control Applications
Robotics and Automation
Intelligent and Information Systems
The Journal addresses research areas focused on control, automation, and systems in electrical, mechanical, aerospace, chemical, and industrial engineering in order to create a strong synergy effect throughout the interdisciplinary research areas.