Xiyu Zhong;Jialei Zhan;Yuhang Xie;Lingtao Zhang;Guoxiong Zhou;Mingyue Liang;Kaitai Yang;Zonghao Guo;Liujun Li
Title: Adaptive Deformation-Learning and Multiscale-Integrated Network for Remote Sensing Object Detection
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-19 (Journal Article)
Publication date: 2025-02-14
DOI: 10.1109/TGRS.2025.3541441
URL: https://ieeexplore.ieee.org/document/10887248/
Journal impact factor: 8.6; JCR: Q1 (Engineering, Electrical & Electronic)
Citation count: 0
Abstract
Modern human productivity and daily life rely on identifying ground objects in remote sensing images (RSIs). Traditional remote sensing object detection (RSOD) techniques lack timeliness and accuracy and fail to meet practical demands. Existing deep learning algorithms still struggle with RSIs because objects have diverse shapes and vary widely in scale, and a significant proportion of them are small. To address these challenges, we propose PSWP-DETR, a transformer-based network that leverages adaptive deformation learning and multiscale integration for enhanced object detection in remote sensing. First, we propose PradatorConv (PdConv) to address the significant shape changes of objects; it adaptively learns horizontal and vertical deformations to perceive the complex geometric features of RSIs. Second, we propose scale-wise differential modules (SDMs), which comprise multiscale convolution (MSC) and edge captor convolution (ECC). An SDM integrates features across scales and captures edge characteristics and local textures, which is advantageous for detecting multiscale objects and tiny objects with limited feature information. Finally, we propose the whale particle optimization (WPO) algorithm for learning rate optimization, which improves convergence speed and accuracy. Experiments on the VisDrone2019-DET, DIOR, and AI-TOD datasets demonstrate that PSWP-DETR achieves the best accuracy, offering significant insights for future RSOD efforts. The source code will be available at https://github.com/Get1star/PSWP-DETR.git.
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.