{"title":"Lightweight Low-Altitude UAV Object Detection Based on Improved YOLOv5s","authors":"Haokai Zeng, Jing Li, Liping Qu","doi":"10.2478/ijanmc-2024-0009","DOIUrl":null,"url":null,"abstract":"\n In the context of rapid developments in drone technology, the significance of recognizing and detecting low-altitude unmanned aerial vehicles (UAVs) has grown. Although conventional algorithmic enhancements have increased the detection rate of low-altitude UAV targets, they tend to neglect the intricate nature and computational demands of the algorithms. This paper introduces ATD-YOLO, an enhanced target detection model based on the YOLOv5s architecture, aimed at tackling this issue. Firstly, a realistic low-altitude UAV dataset is fashioned by amalgamating various publicly available datasets. Secondly, a C3F module grounded in FasterNet, incorporating Partial Convolution (PConv), is introduced to decrease model parameters while upholding detection accuracy. Furthermore, the backbone network incorporates an Efficient Multi-Scale Attention (EMA) module to extract essential image information while filtering out irrelevant details, facilitating adaptive feature fusion. Additionally, the universal upsampling operator CARAFE (Content-aware reassembly of features) is utilized instead of nearest-neighbor upsampling. This enhancement boosts the performance of the feature pyramid network by expanding the receptive field for data feature fusion. Lastly, the Slim-Neck network is introduced to fine-tune the feature fusion network, thereby reducing the model’s floating-point calculations and parameters. Experimental findings demonstrate that the improved ATD-YOLO model achieves an accuracy of 92.8%, with a 31.4% decrease in parameters and a 28.7% decrease in floating-point calculations compared to the original model. The detection speed reaches 75.37 frames per second (FPS). These experiments affirm that the proposed enhancement method meets the deployment requirements for low computational power while maintaining high precision.","PeriodicalId":193299,"journal":{"name":"International Journal of Advanced Network, Monitoring and Controls","volume":"15 4","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Advanced Network, Monitoring and Controls","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2478/ijanmc-2024-0009","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
With the rapid development of drone technology, recognizing and detecting low-altitude unmanned aerial vehicles (UAVs) has become increasingly important. Although conventional algorithmic enhancements have improved the detection rate of low-altitude UAV targets, they tend to overlook algorithmic complexity and computational cost. This paper introduces ATD-YOLO, an improved target detection model based on the YOLOv5s architecture, to address this issue. First, a realistic low-altitude UAV dataset is constructed by combining several publicly available datasets. Second, a C3F module based on FasterNet, which incorporates Partial Convolution (PConv), is introduced to reduce model parameters while maintaining detection accuracy. Third, an Efficient Multi-Scale Attention (EMA) module is added to the backbone network to extract essential image information while filtering out irrelevant details, enabling adaptive feature fusion. Fourth, the universal upsampling operator CARAFE (Content-Aware ReAssembly of FEatures) replaces nearest-neighbor upsampling, improving the feature pyramid network by enlarging the receptive field for feature fusion. Finally, the Slim-Neck design is applied to the feature fusion network, further reducing the model's floating-point operations (FLOPs) and parameters. Experiments show that the improved ATD-YOLO model achieves an accuracy of 92.8%, with 31.4% fewer parameters and 28.7% fewer FLOPs than the original model, at a detection speed of 75.37 frames per second (FPS). These results confirm that the proposed method meets deployment requirements for low-compute platforms while maintaining high precision.
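The parameter savings come largely from PConv, which convolves only a fraction of the input channels and passes the rest through unchanged. Below is a minimal PyTorch sketch of this idea as described in the FasterNet paper; the class name, channel split ratio, and kernel size are illustrative assumptions, not the authors' exact C3F implementation.

```python
import torch
import torch.nn as nn


class PConv(nn.Module):
    """Partial Convolution: apply a spatial convolution to only a
    fraction of the input channels; the remaining channels pass
    through untouched. This cuts parameters and FLOPs relative to
    a full convolution over all channels."""

    def __init__(self, channels: int, partial_ratio: float = 0.25,
                 kernel_size: int = 3):
        super().__init__()
        # Number of channels that actually get convolved (assumed 1/4 here).
        self.conv_channels = int(channels * partial_ratio)
        self.conv = nn.Conv2d(
            self.conv_channels,
            self.conv_channels,
            kernel_size,
            padding=kernel_size // 2,  # preserve spatial size for odd kernels
            bias=False,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split along the channel dimension, convolve the first part,
        # leave the second part as-is, then re-concatenate.
        x1, x2 = torch.split(
            x, [self.conv_channels, x.size(1) - self.conv_channels], dim=1
        )
        return torch.cat((self.conv(x1), x2), dim=1)


if __name__ == "__main__":
    # Usage: a 64-channel feature map; only 16 channels are convolved.
    pconv = PConv(channels=64)
    out = pconv(torch.randn(1, 64, 80, 80))
    print(out.shape)  # torch.Size([1, 64, 80, 80])
```

With a 3x3 kernel and a 1/4 split ratio, the convolution touches 16 of 64 channels, so its weight count is roughly 1/16 that of a full 64-to-64 convolution, which is the kind of reduction that lets the C3F module trade little accuracy for a much smaller model.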