Bing Zhao, Aoran Guo, Ruitao Ma, Yanfei Zhang, Jinliang Gong
{"title":"YOLOv8s-CFB: a lightweight method for real-time detection of apple fruits in complex environments","authors":"Bing Zhao, Aoran Guo, Ruitao Ma, Yanfei Zhang, Jinliang Gong","doi":"10.1007/s11554-024-01543-4","DOIUrl":null,"url":null,"abstract":"<p>With the development of apple-picking robots, deep learning models have become essential in apple detection. However, current detection models are often disrupted by complex backgrounds, leading to low recognition accuracy and slow speeds in natural environments. To address these issues, this study proposes an improved model, YOLOv8s-CFB, based on YOLOv8s. This model introduces partial convolution (PConv) in the backbone network, enhances the C2f module, and forms a new architecture, CSPPC, to reduce computational complexity and improve speed. Additionally, FocalModulation technology replaces the original SPPF module to enhance the model’s ability to recognize key areas. Finally, the bidirectional feature pyramid (BiFPN) is introduced to adaptively learn the importance of weights at each scale, effectively retaining multi-scale information through a bidirectional context information transmission mechanism, and improving the model’s detection ability for occluded targets. Test results show that the improved YOLOv8 network achieves better detection performance, with an average accuracy of 93.86%, a parameter volume of 8.83 M, and a detection time of 0.7 ms. The improved algorithm achieves high detection accuracy with a small weight file, making it suitable for deployment on mobile devices. Therefore, the improved model can efficiently and accurately detect apples in complex orchard environments in real time.</p>","PeriodicalId":51224,"journal":{"name":"Journal of Real-Time Image Processing","volume":"31 4 1","pages":""},"PeriodicalIF":2.9000,"publicationDate":"2024-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Real-Time Image Processing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11554-024-01543-4","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
With the development of apple-picking robots, deep learning models have become essential for apple detection. However, current detection models are often disrupted by complex backgrounds, leading to low recognition accuracy and slow speeds in natural environments. To address these issues, this study proposes an improved model, YOLOv8s-CFB, based on YOLOv8s. The model introduces partial convolution (PConv) into the backbone network and enhances the C2f module to form a new structure, CSPPC, which reduces computational complexity and improves speed. Additionally, FocalModulation replaces the original SPPF module to enhance the model’s ability to recognize key regions. Finally, a bidirectional feature pyramid network (BiFPN) is introduced to adaptively learn importance weights for each scale, effectively retaining multi-scale information through a bidirectional context-propagation mechanism and improving the model’s ability to detect occluded targets. Test results show that the improved YOLOv8 network achieves better detection performance, with an average accuracy of 93.86%, 8.83 M parameters, and a detection time of 0.7 ms. The improved algorithm achieves high detection accuracy with a small weight file, making it suitable for deployment on mobile devices. Therefore, the improved model can efficiently and accurately detect apples in complex orchard environments in real time.
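The abstract only names the building blocks, so the following is a minimal PyTorch sketch of two of them as they are commonly described in the literature: a FasterNet-style partial convolution (PConv) and a BiFPN-style weighted feature fusion. The class names, the channel-split ratio, and the demo tensor shapes are illustrative assumptions; this is not the paper's implementation of CSPPC, FocalModulation, or its exact BiFPN wiring.

```python
import torch
import torch.nn as nn


class PConv(nn.Module):
    """FasterNet-style partial convolution (assumption: 1/4 channel split).

    A 3x3 convolution is applied to only a fraction of the input channels;
    the remaining channels pass through unchanged, which reduces FLOPs and
    memory access compared with a full convolution over all channels.
    """

    def __init__(self, channels: int, conv_ratio: float = 0.25):
        super().__init__()
        self.conv_channels = max(1, int(channels * conv_ratio))
        self.conv = nn.Conv2d(self.conv_channels, self.conv_channels,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_conv, x_id = torch.split(
            x, [self.conv_channels, x.shape[1] - self.conv_channels], dim=1)
        return torch.cat((self.conv(x_conv), x_id), dim=1)


class WeightedFusion(nn.Module):
    """BiFPN-style fast normalized fusion.

    Learns one non-negative scalar weight per input scale and combines the
    (already resized, same-shape) feature maps as a normalized weighted sum.
    """

    def __init__(self, num_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, features: list[torch.Tensor]) -> torch.Tensor:
        w = torch.relu(self.weights)            # keep weights non-negative
        w = w / (w.sum() + self.eps)            # normalize to sum to ~1
        return sum(wi * fi for wi, fi in zip(w, features))


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)              # a hypothetical backbone feature map
    print(PConv(64)(x).shape)                   # torch.Size([1, 64, 80, 80])
    feats = [torch.randn(1, 64, 40, 40) for _ in range(2)]
    print(WeightedFusion(2)(feats).shape)       # torch.Size([1, 64, 40, 40])
```

The design intuition matches the abstract's claims: PConv trims computation in the backbone, while the learned fusion weights let the neck emphasize whichever scale carries the most useful evidence, which is what helps with partially occluded fruit.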
Journal overview:
Due to rapid advancements in integrated circuit technology, the rich theoretical results that have been developed by the image and video processing research community are now being increasingly applied in practical systems to solve real-world image and video processing problems. Such systems involve constraints placed not only on their size, cost, and power consumption, but also on the timeliness of the image data processed.
Examples of such systems are mobile phones, digital still/video/cell-phone cameras, portable media players, personal digital assistants, high-definition television, video surveillance systems, industrial visual inspection systems, medical imaging devices, vision-guided autonomous robots, spectral imaging systems, and many other real-time embedded systems. In these real-time systems, strict timing requirements demand that results are available within a certain interval of time as imposed by the application.
It is often the case that an image processing algorithm is developed and proven theoretically sound, presumably with a specific application in mind. However, the detailed steps, methodology, and trade-off analysis required to achieve its real-time performance are frequently not fully explored, leaving these critical and usually non-trivial issues to those wishing to employ the algorithm in a real-time system.
The Journal of Real-Time Image Processing is intended to bridge the gap between the theory and practice of image processing, serving the greater community of researchers, practicing engineers, and industrial professionals who deal with designing, implementing or utilizing image processing systems which must satisfy real-time design constraints.