Point-to-Point Regression: Accurate Infrared Small Target Detection With Single-Point Annotation
Authors: Rixiang Ni; Jing Wu; Zhaobing Qiu; Liqiong Chen; Changhai Luo; Feng Huang; Qiujiang Liu; Binxing Wang; Yunxiang Li; Youli Li
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-19
Publication date: 2025-03-25
DOI: 10.1109/TGRS.2025.3554025
URL: https://ieeexplore.ieee.org/document/10937752/
Journal impact factor: 8.6; JCR: Q1 (Engineering, Electrical & Electronic)
Citation count: 0
Abstract
Infrared small target detection (IRSTD) plays a vital role in various fields, especially in military early warning and maritime rescue. Its main goal is to accurately locate targets at long distances. Current deep learning (DL)-based methods mainly rely on mask-to-mask or box-to-box regression training and have made considerable progress in detection accuracy. However, these methods require large amounts of training data with expensive manual annotation. Although some researchers attempt to reduce this cost using single-point weak supervision (SPWS), the limited labeling accuracy significantly degrades detection performance. To address these issues, we propose a novel point-to-point regression high-resolution dynamic network (P2P-HDNet), which can accurately locate the target center using only single-point annotation. Specifically, we first devise the high-resolution cross-feature extraction module (HCEM) to provide richer target detail information for the deep feature maps. Notably, HCEM maintains high resolution throughout the feature extraction process to minimize information loss. Then, the dynamic coordinate fusion module (DCFM) is devised to fully fuse the multidimensional features and enhance positional sensitivity. Finally, we devise an adaptive target localization detection head (ATLDH) that further suppresses clutter and improves localization accuracy by regressing a Gaussian heatmap and applying an adaptive non-maximum suppression strategy. Extensive experimental results show that P2P-HDNet achieves better detection accuracy than state-of-the-art (SOTA) methods with only single-point annotation. In addition, our code and datasets will be available at: https://github.com/Anton-Nrx/P2P-HDNet.
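The abstract does not give implementation details, so the following is only a generic sketch of the idea behind heatmap-based point regression, not the authors' ATLDH. It assumes that each single-point annotation is rendered as a 2-D Gaussian to form the regression target, and that target centers are read back as local maxima above a threshold set relative to the strongest response. The function names, `sigma`, `window`, and `rel_threshold` are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_heatmap(shape, points, sigma=2.0):
    """Render one Gaussian per (row, col) point; overlapping blobs keep the max.

    Hypothetical helper: sigma is an assumed value, not one from the paper.
    """
    h, w = shape
    rows = np.arange(h, dtype=np.float64)[:, None]
    cols = np.arange(w, dtype=np.float64)[None, :]
    heatmap = np.zeros((h, w), dtype=np.float64)
    for r, c in points:
        blob = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        heatmap = np.maximum(heatmap, blob)
    return heatmap


def local_peaks(heatmap, window=3, rel_threshold=0.5):
    """Return (row, col) pixels that are window-local maxima above a relative threshold.

    A plain local-maximum filter stands in for the paper's adaptive
    non-maximum suppression; the relative threshold is an assumption.
    """
    peak = heatmap.max()
    if peak <= 0:
        return []  # no response anywhere, so nothing to report
    h, w = heatmap.shape
    pad = window // 2
    padded = np.pad(heatmap, pad, mode="constant", constant_values=-np.inf)
    # Maximum over every shifted copy of the map gives the neighborhood maximum.
    neighborhood_max = np.full((h, w), -np.inf)
    for dr in range(window):
        for dc in range(window):
            neighborhood_max = np.maximum(
                neighborhood_max, padded[dr:dr + h, dc:dc + w]
            )
    mask = (heatmap == neighborhood_max) & (heatmap >= rel_threshold * peak)
    return [(int(r), int(c)) for r, c in np.argwhere(mask)]
```

For example, `local_peaks(gaussian_heatmap((64, 64), [(10, 12), (30, 40)]))` recovers the two annotated centers. In a trained detector the heatmap would come from the network's output rather than from the labels, and the threshold would need tuning against clutter.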
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.