An enhanced lightweight model for small-scale pedestrian detection based on YOLOv8s

IF 2.9; Region 3 (Engineering & Technology); JCR Q2 (ENGINEERING, ELECTRICAL & ELECTRONIC); Digital Signal Processing; Pub Date: 2024-11-10; DOI: 10.1016/j.dsp.2024.104866
Feifei Zhang, Lee Vien Leong, Kin Sam Yen, Yana Zhang
Digital Signal Processing, Volume 156, Article 104866. Journal Article. Not open access.
https://www.sciencedirect.com/science/article/pii/S1051200424004901
Citations: 0

Abstract

Autonomous driving scenes often contain occluded and distant pedestrians, which cause missed and false detections, and detection models are often too large to deploy. To address these issues, this study proposes a lightweight model based on YOLOv8s. The feature extraction and fusion networks are redesigned to optimize the detection layer. In the backbone network, Dual Conv and ELAN are combined to form the EDLAN module. The EDLAN module and an optimized SPPF-LSKA improve small-scale pedestrian feature extraction in complex backgrounds while reducing parameters and computation. In the neck network, BiFPN and VoVGSCSP enhance pedestrian features and improve detection. In addition, the WIoU loss function addresses target imbalance, improving generalization and overall performance. The enhanced YOLOv8s was trained and validated on the CityPersons dataset. Compared with YOLOv8s, it improves precision, recall, F1 score, and mAP@50 by 5.2%, 7.2%, 6.8%, and 6.8%, respectively, while reducing the parameters by 68% and the model size by 67%. Further validation experiments were conducted on the Caltech and BDD100K datasets, where precision increased by 3.4% and 1.1% and mAP@50 increased by 7.6% and 2.8%, respectively. The modified model thus reduces parameter count and model size while improving detection accuracy, which makes it well suited to autonomous driving scenarios.
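The abstract reports precision, recall, and F1 score, and uses an IoU threshold of 0.5 for mAP@50. As a rough illustration of how these quantities relate, the sketch below computes IoU for axis-aligned boxes and derives precision, recall, and F1 from a greedy one-to-one match at a 0.5 threshold. It is not the authors' evaluation code: the function names, the `(x1, y1, x2, y2)` box format, and the greedy matching rule are assumptions made for this example, and the full mAP computation (averaging precision over confidence thresholds) is omitted.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes (format is an assumption)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero so that non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predictions, ground_truths, iou_threshold=0.5):
    """Hypothetical helper: match each prediction to at most one ground-truth box.

    Predictions are processed in the order given (assumed to be sorted by
    descending confidence). A prediction counts as a true positive when its
    best unmatched ground-truth box has IoU >= iou_threshold.
    """
    matched = set()
    true_positives = 0
    for pred in predictions:
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(pred, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1

    false_positives = len(predictions) - true_positives
    false_negatives = len(ground_truths) - true_positives
    precision = true_positives / (true_positives + false_positives) if predictions else 0.0
    recall = true_positives / (true_positives + false_negatives) if ground_truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
    print(precision_recall_f1(preds, gts))
```

In this toy case one of the two predictions matches a ground-truth box, so one detection is a false positive and one pedestrian is missed. Small or distant pedestrians occupy few pixels, so a localization error of a few pixels can push IoU below 0.5, which is one reason small-scale detection is hard to score well on.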
Source journal: Digital Signal Processing
Category: Engineering & Technology, Engineering: Electrical & Electronic
CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days
Journal description: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology,
• financial time series analysis,
• autonomous vehicles,
• quantum computing,
• neuromorphic engineering,
• human-computer interaction and intelligent user interfaces,
• environmental signal processing,
• geophysical signal processing including seismic signal processing,
• chemioinformatics and bioinformatics,
• audio, visual and performance arts,
• disaster management and prevention,
• renewable energy,
Latest articles from this journal
Adaptive polarimetric persymmetric detection for distributed subspace targets in lognormal texture clutter
MFFR-net: Multi-scale feature fusion and attentive recalibration network for deep neural speech enhancement
PV-YOLO: A lightweight pedestrian and vehicle detection model based on improved YOLOv8
Efficient recurrent real video restoration
IGGCN: Individual-guided graph convolution network for pedestrian trajectory prediction