Yield estimation in precision viticulture by combining deep segmentation and depth-based clustering
Rosa Pia Devanna, Laura Romeo, Giulio Reina, Annalisa Milella
Computers and Electronics in Agriculture, volume 232, Article 110025. Published 2025-02-11. DOI: 10.1016/j.compag.2025.110025
https://www.sciencedirect.com/science/article/pii/S0168169925001310
Abstract
Grapevine phenotyping, that is, the process of determining the physical properties (e.g., size, shape, and number) of grape bunches, provides valuable information for growth and health monitoring, yield estimation, and efficient crop management in precision viticulture. Currently, grape bunches are counted and sized manually, which is labor intensive and often impractical for large-scale field applications. This paper describes a novel framework to automatically detect and count grape bunches and to estimate their volume and weight, using RGB and depth data acquired in the field by a farmer robot. The proposed pipeline starts with the semantic segmentation of RGB images, based on a pre-trained MANet architecture with an EfficientNet-B3 backbone, to separate fruit from non-fruit regions. The segmented fruit mask is then projected onto the co-registered depth image to recover a depth mask, allowing for three-dimensional (3D) data association. After a pre-processing step that corrects anomalies, such as corrupted and missing values, and removes outliers, a depth gradient-based clustering algorithm is applied to detect individual grape bunches. This enables the separation of adjacent and partially overlapping bunches. In addition, a method to reconstruct the whole 3D shape of a bunch is introduced, so as to provide an estimate of its volume and weight. Experiments performed in a commercial vineyard in Italy show that, despite the low quality and high variability of the input images, the proposed approach is able to count grape bunches with an average error of about 12% with respect to visual ground truth, and to estimate their weight with an average error of less than 30% with respect to manual weight measurements. It is also shown that the processing framework can be applied to geo-referenced image sequences acquired by the farmer robot while traversing vineyard rows, thus providing an automated pipeline for the generation of high-resolution yield maps for precision viticulture applications.
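The mask-projection and clustering steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no implementation details, so the region-growing rule, the function names (mask_depth, cluster_by_depth_gradient), the max_step threshold, and the toy depth values below are all assumptions chosen for illustration. The idea shown is only that neighbouring fruit pixels are merged when the local depth change between them is small, so two bunches that touch in the image but lie at different depths end up in different clusters.

```python
from collections import deque


def mask_depth(depth, fruit_mask):
    """Project a binary fruit mask onto a co-registered depth image.

    Non-fruit pixels and invalid readings (assumed here to be encoded
    as values <= 0) are set to None.
    """
    return [
        [d if m and d > 0 else None for d, m in zip(depth_row, mask_row)]
        for depth_row, mask_row in zip(depth, fruit_mask)
    ]


def cluster_by_depth_gradient(masked, max_step):
    """Label 4-connected fruit pixels by region growing.

    Two neighbouring pixels join the same cluster only if their depth
    difference is at most max_step (a hypothetical threshold, in the
    same unit as the depth image). Returns (labels, cluster_count),
    where label 0 means "no fruit or no valid depth".
    """
    rows, cols = len(masked), len(masked[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if masked[r][c] is None or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    if masked[ny][nx] is None or labels[ny][nx]:
                        continue
                    # Local depth gradient between adjacent pixels.
                    if abs(masked[ny][nx] - masked[y][x]) <= max_step:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy data (made up): two adjacent bunches about 1.0 m and 1.2 m away,
# one missing depth reading (0.0), and a background column at 2.5 m.
depth = [
    [1.00, 1.01, 1.02, 1.20, 1.21, 2.50],
    [1.01, 0.00, 1.02, 1.21, 1.22, 2.50],
    [1.00, 1.01, 1.01, 1.20, 1.21, 2.50],
]
fruit_mask = [[1, 1, 1, 1, 1, 0] for _ in range(3)]

masked = mask_depth(depth, fruit_mask)
labels, bunch_count = cluster_by_depth_gradient(masked, max_step=0.05)
print(bunch_count)
```

In this toy case the fruit mask forms a single connected blob in the image, yet the jump in depth between the third and fourth columns keeps the two bunches apart. A real pipeline would operate on full-resolution arrays and would need the outlier removal and hole filling that the abstract mentions before clustering.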
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.