SOD-YOLOv10: Small Object Detection in Remote Sensing Images Based on YOLOv10

Hui Sun;Guangzhen Yao;Sandong Zhu;Long Zhang;Hui Xu;Jun Kong
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Publication date: 2025-01-27
Publication type: Journal Article
DOI: 10.1109/LGRS.2025.3534786
URL: https://ieeexplore.ieee.org/document/10855585/
Journal impact factor (as listed): 4.4

Abstract

YOLOv10 is an efficient object detector that locates objects in images quickly and accurately. However, when detecting small objects in remote sensing imagery, conventional algorithms often struggle with background noise, missing information, and complex multiobject interactions, all of which degrade detection performance. To address these issues, we propose an enhanced small object detection algorithm named SOD-YOLOv10. We design the Multidimensional Information Interaction for the Transformer Backbone (TransBone) network, which enhances global perception and integrates local and global information, thereby improving the detection of small object features. We also propose an attention-based feature fusion method called aggregated attention in a gated feature pyramid network (AA-GFPN). It uses an efficient feature aggregation network and re-parameterization techniques to optimize information interaction between feature maps of different scales, and its aggregated attention (AA) mechanism accurately identifies the essential features of small objects. Moreover, we propose the adaptive focal powerful IoU (AFP-IoU) loss function, which prevents excessive expansion of the anchor box area and significantly accelerates model convergence. We evaluate our method on the RSOD, NWPU VHR-10, VisDrone2019, and AI-TOD datasets, where SOD-YOLOv10 attains 95.90%, 92.46%, 55.61%, and 59.47% for mAP@0.5 and 73.42%, 66.84%, 39.03%, and 42.67% for mAP@0.5:0.95, respectively.
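The reported metrics are both defined through IoU thresholds, so a small worked example may help show why small objects are harder to score well on. The sketch below computes plain IoU for axis-aligned boxes and lists which thresholds a prediction clears. It is not the AFP-IoU loss, whose formulation the abstract does not give. It assumes that mAP@0.5:0.95 follows the usual COCO-style convention of averaging over thresholds 0.50 to 0.95 in steps of 0.05, which the abstract does not spell out. The names `iou`, `COCO_THRESHOLDS`, and `thresholds_passed` are illustrative, not taken from the paper.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95
COCO_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, gt):
    """IoU thresholds at which `pred` would count as a match for `gt`."""
    value = iou(pred, gt)
    return [t for t in COCO_THRESHOLDS if value >= t]


# The same 2-pixel localization error on a small and a large object
small_gt, small_pred = (0, 0, 10, 10), (2, 2, 12, 12)
large_gt, large_pred = (0, 0, 100, 100), (2, 2, 102, 102)

print(round(iou(small_pred, small_gt), 4))        # 0.4706
print(round(iou(large_pred, large_gt), 4))        # 0.9238
print(thresholds_passed(small_pred, small_gt))    # []
print(len(thresholds_passed(large_pred, large_gt)))  # 9
```

With a 2-pixel shift, the 10x10 box falls below even the 0.5 threshold, while the 100x100 box still matches at nine of the ten thresholds. This is one reason scores on small object benchmarks tend to be lower, and lower still under mAP@0.5:0.95 than under mAP@0.5.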