YOLOR-Stem: Gaussian rotating bounding boxes and probability similarity measure for enhanced tomato main stem detection
Authors: Guohua Gao, Lifa Fang, Zihua Zhang, Jiahao Li
Journal: Computers and Electronics in Agriculture, volume 233, Article 110192 (JCR Q1, Agriculture, Multidisciplinary; impact factor 7.7)
Published: 2025-03-03
DOI: 10.1016/j.compag.2025.110192
URL: https://www.sciencedirect.com/science/article/pii/S0168169925002984
Citations: 0
Abstract
The tomato is a solanaceous vegetable cultivated worldwide and plays a crucial role in meeting human nutritional requirements. Non-invasive, time-dynamic automated representation and analysis of tomato main stems is critical for autonomous monitoring of canopy morphology throughout the entire tomato growth management cycle. Because plant growth is influenced by genotype and environment, main stems are naturally curved, and branches and leaves shade one another. Combined with the limited camera field of view and horizontal camera movement along crop rows, this means the sensing system observes only discontinuous and curved segments of the main stems. This study proposes an end-to-end YOLOR-Stem approach by optimizing the core components of YOLO v8. First, an innovative method for segmental labelling of main stems using multiple rotating bounding boxes is defined to ensure a precise description. Second, additional angular regression parameters are introduced to capture the orientation and scale information of main stem segments at any angle, overcoming the limitations of horizontal bounding boxes in unstructured field environments. Finally, the Hellinger distance is used to quantify the similarity between the predicted and ground truth distributions, and it is integrated into the positive and negative sample matching strategy, the loss function computation for rotated bounding boxes, and the prediction box screening during non-maximum suppression. The experimental results showed that YOLOR-Stem (input size of 960 × 960 pixels) with an EfficientViT-M1 backbone achieved 91.90% mAP@50 with 9.75 M parameters, 35.5 GFLOPs, and a 10.06 ms inference time. This study enables fast and accurate detection of visible segments of tomato plants, which lays the foundation for intelligent robot-plant interactions such as high-throughput phenotyping, branch and leaf pruning, growth detection, and autonomous harvesting.
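The abstract does not give the exact formulation, so the following is only a minimal sketch of how a Hellinger distance between two rotated boxes could be computed. It assumes each box (cx, cy, w, h, θ) is modelled as a 2D Gaussian centred on the box, with covariance R·diag(w²/4, h²/4)·Rᵀ (a common convention in Gaussian-based rotated detection; the paper's scaling may differ), and it uses the standard closed form for two Gaussians. The function names `box_to_gaussian` and `hellinger_distance` are hypothetical and are not taken from the paper.

```python
import math
from typing import Tuple

# Rotated box: (cx, cy, w, h, theta), theta in radians. Assumed layout.
Box = Tuple[float, float, float, float, float]


def box_to_gaussian(box: Box):
    """Map a rotated box to (mean, covariance) of a 2D Gaussian.

    Assumption: variances along the box axes are (w/2)^2 and (h/2)^2.
    The covariance is returned as (sxx, sxy, syy) of the symmetric 2x2 matrix.
    """
    cx, cy, w, h, theta = box
    c, s = math.cos(theta), math.sin(theta)
    a, b = (w / 2.0) ** 2, (h / 2.0) ** 2
    # Sigma = R * diag(a, b) * R^T with R = [[c, -s], [s, c]]
    sxx = a * c * c + b * s * s
    sxy = (a - b) * c * s
    syy = a * s * s + b * c * c
    return (cx, cy), (sxx, sxy, syy)


def hellinger_distance(box1: Box, box2: Box) -> float:
    """Hellinger distance in [0, 1] between the Gaussians of two boxes."""
    (x1, y1), (a1, b1, c1) = box_to_gaussian(box1)
    (x2, y2), (a2, b2, c2) = box_to_gaussian(box2)

    # Averaged covariance Sigma = (Sigma1 + Sigma2) / 2
    a, b, c = (a1 + a2) / 2.0, (b1 + b2) / 2.0, (c1 + c2) / 2.0
    det = a * c - b * b
    det1 = a1 * c1 - b1 * b1
    det2 = a2 * c2 - b2 * b2

    # Mahalanobis term d^T Sigma^{-1} d, using the 2x2 inverse in closed form
    dx, dy = x2 - x1, y2 - y1
    maha = (c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det

    # Bhattacharyya distance, then Bhattacharyya coefficient
    bhatt = maha / 8.0 + 0.5 * math.log(det / math.sqrt(det1 * det2))
    coeff = math.exp(-bhatt)

    # Clamp guards against tiny negative values from floating-point rounding
    return math.sqrt(max(0.0, 1.0 - coeff))


if __name__ == "__main__":
    stem_a = (100.0, 200.0, 80.0, 12.0, math.radians(80))
    stem_b = (104.0, 205.0, 76.0, 12.0, math.radians(75))
    print(hellinger_distance(stem_a, stem_b))
```

The distance is 0 for identical distributions and approaches 1 as the boxes separate. Under these assumptions, a similarity such as 1 minus the distance could stand in for IoU in sample matching, loss computation, or non-maximum suppression. The Gaussian of a box is unchanged when θ shifts by π, so the measure does not distinguish a segment from the same segment turned end to end.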
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.