Using robotic graspers to harvest fruits and vegetables is a significant advancement in smart agriculture. However, the inherent fragility and varied shapes of many fruits and vegetables pose substantial challenges for adaptive, non-destructive grasping and harvesting with robotic graspers. Grasping motion control and force-uniformity control for different objects are essential to achieving non-destructive grasping and harvesting. Firstly, the working principle of the grasper is presented, along with the design of the joint self-locking and unlocking mechanism. Secondly, the grasping contact force during the movement of the grasper knuckle unit is analyzed. Then, a method is proposed to stop grasper movement based on a binary-code feedback signal, significantly reducing both the complexity of controlling the grasper and the potential for damage to the object. Building on this foundation, a novel method for non-destructive grasping motion control is introduced. Finally, the grasping motion control system is developed on the basis of the above theory, and experiments on adaptive grasping of various fruits and vegetables, as well as knuckle motion control, are conducted. The experiments show that the grasper can adaptively and non-destructively grasp fruits and vegetables of various shapes and types, effectively addressing the problem of end-effectors failing to grasp fruits or damaging them. This work provides a solution for intelligent fruit picking with robotic graspers.
{"title":"Analysis and realization of a self-adaptive grasper grasping for non-destructive picking of fruits and vegetables","authors":"Haibo Huang , Rugui Wang , Fuqiang Huang , Jianneng Chen","doi":"10.1016/j.compag.2025.110119","DOIUrl":"10.1016/j.compag.2025.110119","url":null,"abstract":"<div><div>Using robotic graspers to harvest fruits and vegetables is a significant advancement in smart agriculture. However, the inherent fragility and varied shapes of many fruits and vegetables pose substantial challenges in achieving adaptive, non-destructive grasping and harvesting with robotic graspers. Grasping motion control and force uniformity control for different objects are essential for achieving non-destructive grasping and harvesting. Firstly, the working principle of the grasper is presented, along with the design of the joint self-locking and unlocking mechanism. Secondly, the grasping contact force during the movement of the grasper knuckle unit is analyzed. Then, a method is proposed to control the stopping of grasper movement through a binary code feedback signal, significantly reducing both the complexity of controlling the grasper and the potential for damage to the object. Building upon this foundation, a novel method for non-destructive grasping motion control is introduced. Finally, the grasping motion control system is developed based on the above theory, and experiments on the adaptive grasping of various fruits and vegetables as well as knuckle motion control are conducted. The experiments show that the grasper can adaptively and non-destructively grasp various shapes and types of fruits and vegetables, effectively solving the problem that the end-effector cannot grasp the fruits or cause damage to the fruits. The work in this paper provides a solution for the realization of intelligent fruit picking by robotic grasper.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110119"},"PeriodicalIF":7.7,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143436401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-18 | DOI: 10.1016/j.compag.2025.110102
Lidia M. Ortega-Alvarado, Juan Carlos Fernández-Pérez, David Jurado-Rodríquez
Hyperspectral sensors are revolutionizing precision agriculture by capturing spectral responses across a broad range of the spectrum. Furthermore, the integration of these data with three-dimensional information is increasingly crucial for generating enriched spatial and spectral datasets, enabling the application of more sophisticated analytical techniques. However, the analysis of this information lacks agility, and its visualization and interactive exploration are not integrated within a unified framework. The high dimensionality and volume of the fused information pose significant computational and visualization constraints for real-time processing. This paper presents the methodologies that make advanced analysis of hyperspectral data fused with 3D point clouds possible. We introduce GEU, a novel interactive framework that facilitates real-time interaction, visualization, and spectral and spatial analysis. To handle the fused data efficiently, this approach leverages meshlets, implemented directly on GPUs, for optimized spatial data management. A parallel data structure, termed Meanlet, representing the average spectral behavior of these spatial clusters, is maintained in main memory. The result is an integrated framework enabling real-time visualization, interaction, and analysis of hyperspectral data, including spectral information fused with 3D point clouds.
{"title":"Meshlets based data model for real-time interaction and analysis with hyper-spectral vegetation data","authors":"Lidia M. Ortega-Alvarado, Juan Carlos Fernández-Pérez, David Jurado-Rodríquez","doi":"10.1016/j.compag.2025.110102","DOIUrl":"10.1016/j.compag.2025.110102","url":null,"abstract":"<div><div>Hyperspectral sensors are revolutionizing precision agriculture by capturing spectral responses across a broad range of the spectrum. Furthermore, the integration of these data with three-dimensional information is increasingly crucial for generating enriched spatial and spectral datasets, enabling the application of more sophisticated analytical techniques. However, the analysis of this information lacks agility, and its visualization and interactive exploration are not integrated within a unified framework. The high dimensionality and volume of the fused information pose significant computational and visualization constraints for real-time processing. This paper presents the methodologies that makes possible advanced analysis of hyperspectral data fused with 3D point clouds to achieve advanced analysis. We introduce GEU, a novel interactive framework, which facilitates real-time interaction, visualization and spectral and spatial analysis. To achieve efficient handling of the fused data, this approach leverages meshlets, implemented directly on GPUs, for optimized spatial data management. A parallel data structure, termed Meanlet, representing the average spectral behavior of these spatial clusters, is maintained in main memory. The results in an integrated framework enabling real-time visualization, interaction, and analysis of hyperspectral data, including spectral information fused with 3D point clouds.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110102"},"PeriodicalIF":7.7,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143427494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-17 | DOI: 10.1016/j.compag.2025.110129
Pu-Yun Kow, Yun-Ting Wang, Yu-Wen Chang, Meng-Hsin Lee, Ming-Hwi Yao, Li-Chiu Chang, Fi-John Chang
Artificial Intelligence (AI) is reshaping agriculture by driving smarter, data-driven practices that enhance regional weather forecasting and support proactive, informed decision-making. Advances in Big Data, IoT, Remote Sensing, and Machine Learning are accelerating this transformation, with Transformer architectures increasingly pivotal in refining agricultural management strategies, especially in Taiwan. In this study, we develop a hybrid Convolutional Autoencoder and LSTM-based Transformer Network (CAE-LSTMT) to downscale six-hour simulation data into precise hourly forecasts, validated using 55,538 temperature and relative humidity records (2020–2023) from Taiwan’s Jhuoshuei River basin, provided by the Central Weather Administration (CWA). The model was trained (70 %), validated (10 %), and tested (20 %) to optimize its configuration and performance. The CAE-LSTMT model substantially enhances spatiotemporal weather forecast resolution, transforming six-hour regional data into hourly forecasts with improved accuracy. It yields temperature forecast gains of 5.66 % to 20.39 % and relative humidity improvements of 8.05 % to 12.76 %, with reduced forecast biases compared to traditional LSTM models. The model demonstrates exceptional accuracy in vapor pressure deficit (VPD) predictions, achieving mean absolute errors (MAE) between 0.15 and 0.21 kPa across regions and between 0.16 and 0.20 kPa seasonally, significantly outperforming the CWA model. Accurate VPD forecasts allow farmers to manage irrigation and minimize crop stress, directly supporting plant health and yield optimization. For heat index classification, the model achieves up to 96 % accuracy, with mean absolute percentage errors (MAPE) of 4 % to 23 %, significantly exceeding the CWA model’s accuracy range of 35 % to 79 % and MAPE of 29 % to 70 %. This high precision in heat index forecasting empowers farmers to protect crops and livestock against heat stress. By extracting critical features from high-dimensional data, the CAE-LSTMT model advances environmental downscaling for multi-site, multi-horizon weather data, showing significant promise for Smart Agriculture and Health Advisory Systems. This approach offers precise, actionable forecasts, optimizing agricultural practices and reducing climate-related risks, underscoring its impact on sustainable agricultural and environmental management.
{"title":"AI-driven weather downscaling for smart agriculture using autoencoders and transformers","authors":"Pu-Yun Kow , Yun-Ting Wang , Yu-Wen Chang , Meng-Hsin Lee , Ming-Hwi Yao , Li-Chiu Chang , Fi-John Chang","doi":"10.1016/j.compag.2025.110129","DOIUrl":"10.1016/j.compag.2025.110129","url":null,"abstract":"<div><div>Artificial Intelligence (AI) is reshaping agriculture by driving smarter, data-driven practices that enhance regional weather forecasting and support proactive, informed decision-making. Advances in Big Data, IoT, Remote Sensing, and Machine Learning are accelerating this transformation, with Transformer architectures increasingly pivotal in refining agricultural management strategies, especially in Taiwan. In this study, we develop a hybrid Convolutional Autoencoder and LSTM-based Transformer Network (CAE-LSTMT) to downscale six-hour simulation data into precise hourly forecasts, validated using 55,538 temperature and relative humidity records (2020–2023) from Taiwan’s Jhuoshuei River basin, provided by the Central Weather Administration (CWA). The model was trained (70 %), validated (10 %), and tested (20 %) to optimize its configuration and performance. This CAE-LSTMT model substantially enhances spatiotemporal weather forecast resolution, transforming six-hour regional data into hourly forecasts with improved accuracy. It yields temperature forecast gains of 5.66 % to 20.39 % and relative humidity improvements of 8.05 % to 12.76 %, with reduced forecast biases compared to traditional LSTM models. The model demonstrates exceptional accuracy in vapor pressure deficit (VPD) predictions, achieving mean absolute errors (MAE) between 0.15 to 0.21 kPa across regions and 0.16 to 0.20 kPa seasonally, significantly outperforming the CWA model. Accurate VPD forecasts allow farmers to manage irrigation and minimize crop stress, directly supporting plant health and yield optimization. For heat index classification, the model achieves up to 96 % ACCURACY, with mean absolute percentage errors (MAPE) of 4 % to 23 %, significantly exceeding the CWA model’s ACCURACY range of 35 % to 79 % and MAPE of 29 % to 70 %. This high precision in heat index forecasting empowers farmers to protect crops and livestock against heat stress. By extracting critical features from high-dimensional data, the CAE-LSTMT model advances environmental downscaling for multi-site, multi-horizon weather data, showing significant promise for Smart Agriculture and Health Advisory Systems. This approach offers precise, actionable forecasts, optimizing agricultural practices and reducing climate-related risks, underscoring its impact on sustainable agricultural and environmental management.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110129"},"PeriodicalIF":7.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143427489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-17 | DOI: 10.1016/j.compag.2025.110124
Chao Ban, Lin Wang, Tong Su, Ruijuan Chi, Guohui Fu
The navigation line serves as a datum for autonomous agricultural robots engaged in tasks such as monitoring, spraying, and fertilization, enabling them to traverse along crop rows. Although above-canopy navigation line extraction methods for early-growth cornfields are well developed, they are difficult to apply to cornfields with tall plants and wide canopies in mid-to-late growth. Extracting navigation lines under the canopy using environment-aware sensors mounted at a lower position on robots is a viable option, but under-canopy complexities such as crossing leaves, varying exit crops, and changing light conditions pose a serious challenge to methods based on a single sensor. Therefore, this study proposes a feature-level fusion method based on a monocular camera and 3D Light Detection and Ranging (LiDAR), using corn stems as references. The method includes three steps: (i) semantic segmentation of the ground and corn stems in the image by the constructed StemFormer, a Transformer-based dual-branch network; (ii) after segmenting the ground and stem LiDAR point clouds based on the image semantic mask, applying the proposed adaptive radius filter to the stem point cloud after dimensionality reduction based on the ground plane; and (iii) extracting the navigation line under the corn canopy by clustering the stem point cloud with the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm and fitting the clustering centers by the Least Squares Method (LSM). The experimental results validate the accuracy and real-time performance of the fusion method, achieving a mean correct rate of navigation line extraction of 93.67 %, a mean absolute error of heading angle of 1.53° with a standard deviation of 1.46°, and an overall maximum running time of 80.58 ms. The navigation line extraction method offers a novel strategy for the automated navigation of agricultural robots in fields with tall crops such as corn.
{"title":"Fusion of monocular camera and 3D LiDAR data for navigation line extraction under corn canopy","authors":"Chao Ban , Lin Wang , Tong Su , Ruijuan Chi , Guohui Fu","doi":"10.1016/j.compag.2025.110124","DOIUrl":"10.1016/j.compag.2025.110124","url":null,"abstract":"<div><div>The navigation line serves as a datum for autonomous agricultural robots engaged in tasks such as monitoring, spraying, and fertilization, enabling them to traverse along crop rows. Although the above-canopy navigation line extraction methods for early-growth cornfields are advanced, they are difficult to apply to cornfields with tall plants and wide canopies in mid-to-late growth. Extracting navigation lines under the canopy using environment-aware sensors mounted at a lower position on robots is a viable option, but under-canopy complexities such as crossing leaves, varying exit crops, and light conditions pose a serious challenge to methods based on a single sensor. Therefore, this study proposes a feature-level fusion method by a monocular camera and 3D Light Detection and Ranging (LiDAR) using corn stems as references. This method includes three steps: (i) Semantic segmentation of the ground and corn stems in the image by the constructed StemFormer, which is a Transformer-based dual-branch network. (ii) After segmenting the ground and stem LiDAR point clouds based on the image semantic mask, the proposed adaptive radius filter is applied to filter the stem point cloud after dimensionality reduction based on the ground plane. (iii) The extraction of the navigation line under the corn canopy is achieved by clustering the stem point cloud using the Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm and fitting the clustering centers by the Least Squares Method (LSM). The experimental results validate the accuracy and real-time performance of the fusion method, achieving a mean correct rate of navigation line extraction at 93.67 %, a mean absolute error of heading angle at 1.53° with a standard deviation of 1.46°, and an overall maximum running time of 80.58 ms. The navigation line extraction method offers a novel strategy for the automated navigation of agricultural robots in fields with tall crops such as corn.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110124"},"PeriodicalIF":7.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Choy sum (Brassica rapa var. parachinensis) is a commonly grown leafy vegetable, primarily harvested for its stem. Accurate stem segmentation is crucial for precise harvesting, yet the visual similarity between choy sum stems and leaves poses challenges for traditional supervised learning methods, making data labeling costly and affecting segmentation accuracy. This study introduces AD-DMT, an enhanced Dynamic Mutual Training (DMT) algorithm for semi-supervised segmentation, which improves on the original framework in three ways: 1) data augmentation strategies such as CutMix and brightness and contrast adjustments are introduced to alleviate the generalization difficulties caused by data homogeneity; 2) adaptive loss-weight re-scaling factors (γ1 and γ2) dynamically adjust the balance between mutual learning and entropy minimization based on the training epoch; 3) a dynamic temperature coefficient is incorporated to enhance divergent learning during training by modulating the Softmax output. For validation, images of field-grown choy sum were captured to evaluate AD-DMT’s performance under different labeled-data ratios (1/2, 1/4, 1/8, 1/20). The results demonstrate efficient segmentation across all conditions, with mIoU values exceeding 84.0 %. Notably, even with minimal labeled data (a 1/20 ratio), AD-DMT achieved a 4.04 % improvement in mIoU over the baseline. Building on these segmentation results, we further determined the optimal cutting points of choy sum stems using skeleton extraction and corner detection algorithms and calculated the three-dimensional coordinates of these points from depth images, achieving an average vertical offset error (VOE) within 6.29 mm.
{"title":"Dynamic mutual training semi-supervised semantic segmentation algorithm with adaptive capability (AD-DMT) for choy sum stem segmentation and 3D positioning of cutting points","authors":"Kai Yuan, Qian Wang, Zuoxi Zhao, Mengcheng Wu, Yuanqing Shui, Xiaonan Yang, Ruihan Xu","doi":"10.1016/j.compag.2025.110105","DOIUrl":"10.1016/j.compag.2025.110105","url":null,"abstract":"<div><div>Choy sum (<em>Brassica rapa</em> var. <em>parachinensis</em>) is a commonly grown leafy vegetable, primarily harvested for its stem. Accurate stem segmentation is crucial for accurate harvesting, yet the visual similarity between choy sum stems and leaves poses challenges for traditional supervised learning methods, making data labeling costly and affecting segmentation accuracy. This study introduces AD-DMT, an enhanced Dynamic Mutual Training (DMT) algorithm for semi-supervised segmentation, which improves on the original framework by incorporating: 1) Introduction of data augmentation strategies such as CutMix, brightness, and contrast adjustments to alleviate model generalization difficulties caused by data homogeneity; 2) The design of adaptive loss weights re-scaled factor (<em>γ<sub>1</sub></em> and <em>γ<sub>2</sub></em>) dynamically adjusts the balance between mutual learning and entropy minimization based on training epochs; 3) A dynamic temperature coefficient is incorporated to enhance divergent learning in training by modulating Softmax output. For validation, images of field-grown choy sum were captured to evaluate AD-DMT’s performance under different labeled data ratios (1/2, 1/4, 1/8, 1/20). The results demonstrate efficient segmentation across all conditions, with mIoU values exceeding 84.0 %. Notably, even with minimal labeled data (1/20 ratio), AD-DMT achieved a 4.04 % improvement in mIoU over the baseline. Building on these segmentation results, we further determined the optimal cutting points of choy sum stems by using skeleton extraction and corner detection algorithms, calculating the three-dimensional coordinates of these points with depth images, achieving an average vertical offset error (VOE) within 6.29 mm.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110105"},"PeriodicalIF":7.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-17 | DOI: 10.1016/j.compag.2025.110094
Cheng Cao, Pei Yang, Chaoyuan Tang, Fubin Liang, Jingshan Tian, Yali Zhang, Wangfeng Zhang
Extracting cotton boll phenotypic parameters from imaging data is a prerequisite for intelligently characterizing boll growth and development. However, current methods relying on manual measurements are inefficient and often inaccurate. To address this, we developed a cotton boll phenotypic parameter extraction program (CPVS), a tool designed to estimate the morphological characteristics of unopened cotton bolls from images. CPVS integrates semi-automatic data extraction with advanced algorithms to calculate length, width, volume, and surface area. Length and width estimation algorithms were developed using a custom “Fixed” image set, which links pixel dimensions to actual measurements. Volume and surface area models were based on shape classification using a custom “Random” image set, trait correlations, and measured data. Testing showed strong performance, with R² values of 0.880 and 0.769 and root mean square error (RMSE) values of 0.173 and 0.188 for length and width, respectively. The volume model achieved an R² of 0.91 and an RMSE of 1.76, while surface area models had R² values of 0.76 and RMSEs of 2.37 and 2.41. These results indicate that CPVS is a robust tool, providing theoretical and practical support for efficient, accurate characterization of cotton boll morphology.
{"title":"Morphological characteristic extraction of unopened cotton bolls using image analysis and geometric modeling methods","authors":"Cheng Cao , Pei Yang , Chaoyuan Tang, Fubin Liang, Jingshan Tian, Yali Zhang, Wangfeng Zhang","doi":"10.1016/j.compag.2025.110094","DOIUrl":"10.1016/j.compag.2025.110094","url":null,"abstract":"<div><div>Extracting cotton boll phenotypic parameters from imaging data is a prerequisite for intelligently characterizing boll growth and development. However, current methods relying on manual measurements are inefficient and often inaccurate. To address this, we developed a cotton boll phenotypic parameter extraction program (CPVS), a tool designed to estimate the morphological characteristics of unopened cotton bolls from images. CPVS integrates semi-automatic data extraction with advanced algorithms to calculate length, width, volume, and surface area. Length and width estimation algorithms were developed using a custom “Fixed” image set, which links pixel dimensions to actual measurements. Volume and surface area models were based on shape classification using a custom “Random” image set, trait correlations, and measured data. Testing showed strong performance, with R<sup>2</sup> values of 0.880 and 0.769 and root mean square error (RMSE) values of 0.173 and 0.188 for length and width, respectively. The volume model achieved an R<sup>2</sup> of 0.91 and an RMSE of 1.76, while surface area models had R<sup>2</sup> values of 0.76 and RMSEs of 2.37 and 2.41. These results indicate that CPVS is a robust tool, providing theoretical and practical support for efficient, accurate characterization of cotton boll morphology.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110094"},"PeriodicalIF":7.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-17 | DOI: 10.1016/j.compag.2025.110070
Yupan Zhang, Yiliu Tan, Xin Xu, Hangkai You, Yuichi Onda, Takashi Gomi
In forest ecosystems, branch and leaf structures play crucial roles in hydrological and vegetative physiology. However, accurately characterizing branch and leaf structures in dense forest scenarios is challenging, limiting our understanding of how these structures affect processes such as interception loss, stemflow, and throughfall. Both terrestrial and drone LiDAR technologies have demonstrated impressive performance in providing detailed insights into forest structures from different perspectives. By leveraging the fusion of point clouds, we classified the leaves and branches of three Japanese cypress trees. The voxel space occupied by leaf points was calculated using voxelization, visible branches were fitted using line segments, and the angles and lengths of the invisible branches within the canopy were estimated using the tree-form coefficient. The quantitative analysis showed that the voxel space occupied by leaf points averaged 0.89 ± 0.42 m³/m² at the single-tree and plot scales. Then, 82, 53, and 58 visible branches were fitted and 23, 14, and 12 invisible branches were estimated for the three trees, respectively. Destructive harvesting was conducted on a single tree to assess the accuracy of branch identification and parameter extraction at the individual branch level. The results yielded an F1-score of 0.76 for branch identification and nRMSEs of 32.14 % for branch length and 13.68 % for branch angle, respectively. Our method solves the problem of extracting the branch and leaf structures of single trees in dense forest scenarios with heavy occlusion. The reconstructed tree model can be further applied to accurately estimate tree attributes and support canopy hydrology simulations.
{"title":"Individual tree branch and leaf metrics extraction in dense plantation scenario through the fusion of drone and terrestrial LiDAR","authors":"Yupan Zhang , Yiliu Tan , Xin Xu , Hangkai You , Yuichi Onda , Takashi Gomi","doi":"10.1016/j.compag.2025.110070","DOIUrl":"10.1016/j.compag.2025.110070","url":null,"abstract":"<div><div>In forest ecosystems, branch and leaf structures play crucial roles in hydrological and vegetative physiology. However, accurately characterizing branch and leaf structures in dense forest scenarios is challenging, limiting our understanding of how branch and leaf structures affect processes such as interception loss, stemflow, and throughfall. Both terrestrial and drone LiDAR technologies have demonstrated impressive performances in providing detailed insights into forest structures from different perspectives. By leveraging the fusion of point clouds, we classified the leaf and branch of three Japanese cypress trees. Leaf points occupied voxel space was calculated using voxelization, visible branches were fitted using line segments, and the angles and lengths of the invisible branches within the canopy were estimated using the tree-form coefficient. The quantitative analysis results showed that leaf points occupied voxel space at the single-tree and plot scales average were 0.89 ± 0.42 m<sup>3</sup>/m<sup>2</sup>. Then, 82, 53, and 58 visible branches were fitted and 23, 14, and 12 invisible branches were estimated for the three trees, respectively. Destructive harvesting was conducted on a single tree to assess the accuracy of branch identification and parameter extraction at the individual branch level. The results yielded an <span><math><mrow><mi>F</mi><mn>1</mn><mo>-</mo><mi>s</mi><mi>c</mi><mi>o</mi><mi>r</mi><mi>e</mi></mrow></math></span> of 0.76 for branch identification and nRMSEs of 32.14 % for branch length and 13.68 % for branch angle, respectively. Our method solves the problem of extracting the branch and leaf structures of single trees in dense forest scenarios with heavy occlusion. The reconstructed tree model can be further applied to estimate tree attributes and canopy hydrology simulations accurately.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110070"},"PeriodicalIF":7.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-17 | DOI: 10.1016/j.compag.2025.110122
Haikuan Feng, Yiguang Fan, Jibo Yue, Mingbo Bian, Yang Liu, Riqiang Chen, Yanpeng Ma, Jiejie Fan, Guijun Yang, Chunjiang Zhao
Accurate estimation of above-ground biomass (AGB) in potato plants is essential for effective monitoring of potato growth and reliable yield prediction. Remote sensing technology has emerged as a promising method for monitoring crop growth parameters due to its high throughput, non-destructive nature, and rapid acquisition of information. However, the sensitivity of remote sensing vegetation indices to crop AGB declines at moderate to high crop coverage, known as the “saturation phenomenon,” which limits accurate AGB monitoring during the mid-to-late growth stages. This challenge also hinders the development of a multi-growth-cycle AGB estimation model. In this study, a novel VGC-AGB model integrated with hyperspectral remote sensing was used for multi-stage estimation of potato AGB. The study consists of three main components: (1) addressing the “saturation problem” encountered when using remote sensing spectral indices to monitor crop biomass across multiple growth stages; the VGC-AGB model calculates leaf biomass as the product of leaf dry mass content (Cm) and leaf area index (LAI), and vertical organ biomass as the product of crop density (Cd), crop height (Ch), and the average dry mass content of the crop stems and reproductive organs (Csm); (2) estimating the VGC-AGB model parameters Cm and LAI by integrating hyperspectral remote sensing data with a deep learning model; (3) comparing the performance of three methods, (i) hyperspectral + Ch, (ii) ground-measured parameters + VGC-AGB model, and (iii) hyperspectral remote sensing + VGC-AGB model, using a five-year dataset of potato above-ground biomass. Results indicate that (1) the VGC-AGB model achieved high accuracy in estimating AGB (R² = 0.853, RMSE = 751.12 kg/ha), significantly outperforming the deep learning model based on hyperspectral + Ch data (R² = 0.683, RMSE = 1122.03 kg/ha); and (2) the combination of the VGC-AGB model and hyperspectral remote sensing provided highly accurate AGB estimates (R² = 0.760, RMSE = 965.59 kg/ha), surpassing the results obtained with the hyperspectral + Ch-based method. Future research will primarily focus on streamlining the acquisition of VGC-AGB model parameters, optimizing the acquisition and processing of remote sensing data, and enhancing model validation and application. Furthermore, it is essential to conduct cross-regional validation and optimize model parameters for various crops to improve the universality and adaptability of the proposed model.
{"title":"Estimation of potato above-ground biomass based on the VGC-AGB model and deep learning","authors":"Haikuan Feng , Yiguang Fan , Jibo Yue , Mingbo Bian , Yang Liu , Riqiang Chen , Yanpeng Ma , Jiejie Fan , Guijun Yang , Chunjiang Zhao","doi":"10.1016/j.compag.2025.110122","DOIUrl":"10.1016/j.compag.2025.110122","url":null,"abstract":"<div><div>Accurate estimation of above-ground biomass (AGB) in potato plants is essential for effective monitoring of potato growth and reliable yield prediction. Remote sensing technology has emerged as a promising method for monitoring crop growth parameters due to its high throughput, non-destructive nature, and rapid acquisition of information. However, the sensitivity of remote sensing vegetation indices to crop AGB parameters declines at moderate to high crop coverage, known as the “saturation phenomenon,” which limits accurate AGB monitoring during the mid-to-late growth stages. This challenge also hinders the development of a multi-growth-cycle AGB estimation model. In this study, a novel VGC-AGB model integrated with hyperspectral remote sensing was utilized for multi-stage estimation of potato AGB. This study consists of three main components: (1) addressing the “saturation problem” encountered when using spectral indices from remote sensing to monitor crop biomass across multiple growth stages. The VGC-AGB model calculates the leaf biomass by multiplying leaf dry mass content (Cm) and leaf area index (LAI) and vertical organ biomass using the multiplication of crop density (Cd), crop height (Ch) and the crop stem and reproductive organs’ average dry mass content (Csm); (2) estimating the VGC-AGB model parameters Cm and LAI by integrating hyperspectral remote sensing data with a deep learning model; (3) comparing the performance of three methods—(i) hyperspectral + Ch, (ii) ground-measured parameters + VGC-AGB model, and (iii) hyperspectral remote sensing + VGC-AGB model—using a five-year dataset of potato above-ground biomass. Results indicate that (1) the VGC-AGB model achieved high accuracy in estimating AGB (<em>R</em><sup>2</sup> = 0.853, RMSE = 751.12 kg/ha), significantly outperforming the deep learning model based on hyperspectral + Ch data (<em>R</em><sup>2</sup> = 0.683, RMSE = 1122.03 kg/ha); (2) the combination of the VGC-AGB model and hyperspectral remote sensing provided highly accurate results in estimating AGB (<em>R</em><sup>2</sup> = 0.760, RMSE = 965.59 kg/ha), surpassing the results obtained using the hyperspectral + Ch-based method. Future research will primarily focus on streamlining the acquisition of VGC-AGB model parameters, optimizing the acquisition and processing of remote sensing data, and enhancing model validation and application. 
Furthermore, it is essential to conduct cross-regional validation and optimize model parameters for various crops to improve the universality and adaptability of the proposed model.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110122"},"PeriodicalIF":7.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
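The VGC-AGB decomposition described in the abstract reduces to the sum of a leaf term and a vertical-organ term. The sketch below encodes that structure; the numeric values and unit conventions are illustrative assumptions only.

```python
# Sketch of the VGC-AGB structure as described in the abstract:
# AGB = leaf biomass (Cm * LAI) + vertical-organ biomass (Cd * Ch * Csm).
# The example values and unit conventions are illustrative assumptions only.

def vgc_agb(cm_leaf_dry_mass: float, lai: float,
            cd_plant_density: float, ch_height: float,
            csm_stem_dry_mass: float) -> float:
    leaf_biomass = cm_leaf_dry_mass * lai                                 # leaf component
    vertical_biomass = cd_plant_density * ch_height * csm_stem_dry_mass  # stems/organs
    return leaf_biomass + vertical_biomass

# Cm and LAI would come from the hyperspectral + deep-learning retrieval step;
# Cd, Ch and Csm from density, height and dry-mass-content measurements.
agb = vgc_agb(cm_leaf_dry_mass=45.0, lai=3.2,
              cd_plant_density=5.5, ch_height=0.6, csm_stem_dry_mass=120.0)
print(f"estimated AGB = {agb:.1f} (per unit ground area)")
```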
Pub Date: 2025-02-16 | DOI: 10.1016/j.compag.2025.110115
Hazhir Bahrami, Karem Chokmani, Saeid Homayouni, Viacheslav I. Adamchuk, Md Saifuzzaman, Maxime Leduc
Among the various types of forages, alfalfa (Medicago sativa) is a crucial forage crop that plays a vital role in livestock nutrition and sustainable agriculture. As a result of its ability to adapt to different weather conditions and its high nitrogen-fixation capability, this crop produces high-quality forage that contains between 15 and 22 % protein. Remote sensing technologies make it possible to improve the overall prediction of forage biomass and quality prior to harvest. The recent advent of deep convolutional neural networks (deep CNNs) enables researchers to leverage these powerful algorithms. This study aims to build a model to count the number of alfalfa stems from proximal images. To this end, we first utilized a deep CNN encoder-decoder to segment alfalfa and other background objects in a field, such as soil and grass. Subsequently, we employed the alfalfa cover fractions derived from the proximal images to develop and train machine learning regression models for estimating the stem count in the images. This study uses a large number of proximal images taken from a significant number of fields in four provinces of Canada over three consecutive years. A combination of real and synthetic images was used to train the deep neural network encoder-decoder. This study gathered roughly 3447 alfalfa images, 5332 grass images, and 9241 background images for training the encoder-decoder model. With data augmentation, we prepared about 60,000 annotated images of alfalfa fields containing alfalfa, grass, and background using a pre-trained model in less than an hour. Several convolutional neural network encoder-decoder models were evaluated in this study. Simple U-Net, Attention U-Net (Att U-Net), and ResU-Net with attention gates were trained to detect alfalfa and differentiate it from other objects. The best Intersections over Union (IoU) for simple U-Net classes were 0.98, 0.93, and 0.80 for background, alfalfa, and grass, respectively. Simple U-Net with synthetic data provides promising results on unseen real images and requires only an RGB iPad image for field-specific alfalfa detection. It was also observed that simple U-Net has slightly better accuracy than Attention U-Net and attention ResU-Net. Finally, we built regression models between the alfalfa cover fraction in the original images taken by iPad and the mean number of alfalfa stems per square foot. Random forest (RF), Support Vector Regression (SVR), and Extreme Gradient Boosting (XGB) methods were utilized to estimate the number of stems in the images. RF was the best model for estimating the number of alfalfa stems relative to the other machine learning algorithms, with a coefficient of determination (R²) of 0.82, a root-mean-square error of 13.00, and a mean absolute error of 10.07.
{"title":"Alfalfa detection and stem count from proximal images using a combination of deep neural networks and machine learning","authors":"Hazhir Bahrami , Karem Chokmani , Saeid Homayouni , Viacheslav I. Adamchuk , Md Saifuzzaman , Maxime Leduc","doi":"10.1016/j.compag.2025.110115","DOIUrl":"10.1016/j.compag.2025.110115","url":null,"abstract":"<div><div>Among various types of forages, Alfalfa (Medicago sativa) is a crucial forage crop that plays a vital role in livestock nutrition and sustainable agriculture. As a result of its ability to adapt to different weather conditions and its high nitrogen fixation capability, this crop produces high-quality forage that contains between 15 and 22 % protein. It is fortunately possible to improve the overall prediction of forage biomass and quality prior to harvest through remote sensing technologies. The recent advent of deep Convolution Neural Networks (deep CNNs) enables researchers to utilize these incredible algorithms. This study aims to build a model to count the number of alfalfa stems from proximal images. To this end, we first utilized a deep CNN encoder-decoder to segment alfalfa and other background objects in a field, such as soil and grass. Subsequently, we employed the alfalfa cover fractions derived from the proximal images to develop and train machine learning regression models for estimating the stem count in the images. This study uses many proximal images taken from significant number of fields in four provinces of Canada over three consecutive years. A combination of real and synthetic images has been utilized to feed the deep neural network encoder-decoder. This study gathered roughly 3447 alfalfa images, 5332 grass images, and 9241 background images for training the encoder-decoder model. With data augmentation, we prepared about 60,000 annotated images of alfalfa fields containing alfalfa, grass, and background utilizing a pre-trained model in less than an hour. Several convolutional neural network encoder-decoder models have also been utilized in this study. Simple U-Net, Attention U-Net (Att U-Net), and ResU-Net with attention gates have been trained to detect alfalfa and differentiate it from other objects. The best Intersections over Union (IoU) for simple U-Net classes were 0.98, 0.93, and 0.80 for background, alfalfa and grass, respectively. Simple U-Net with synthetic data provides a promising result over unseen real images and requires an RGB iPad image for field-specific alfalfa detection. It was also observed that simple U-Net has slightly better accuracy than attention U-Net and attention ResU-Net. Finally, we built regression models between the alfalfa cover fraction in the original images taken by iPad, and the mean alfalfa stems per square foot. Random forest (RF), Support Vector Regression (SVR), and Extreme Gradient Boosting (XGB) methods have been utilized to estimate the number of stems in the images. 
RF was the best model for estimating the number of alfalfa stems relative to other machine learning algorithms, with a coefficient of determination (R<sup>2</sup>) of 0.82, root-mean-square error of 13.00, and mean absolute error of 10.07.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110115"},"PeriodicalIF":7.7,"publicationDate":"2025-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
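The final regression stage, from segmented alfalfa cover fraction to stems per square foot, can be sketched with scikit-learn. The synthetic data, the assumed class id, and the hyperparameters are placeholders; only the cover-fraction feature and the random forest regressor follow the abstract.

```python
# Sketch of the last stage: compute the alfalfa cover fraction from a U-Net
# class mask and regress stem count per square foot with a random forest.
# Synthetic data, class id and hyperparameters are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

ALFALFA_CLASS = 1  # assumed label id in the segmentation mask

def cover_fraction(mask: np.ndarray) -> float:
    """Fraction of pixels classified as alfalfa."""
    return float((mask == ALFALFA_CLASS).mean())

# Synthetic stand-in for (cover fraction, stems/ft^2) pairs from many fields.
rng = np.random.default_rng(3)
cover = rng.uniform(0.05, 0.95, size=500)
stems = 80.0 * cover + rng.normal(0.0, 8.0, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    cover.reshape(-1, 1), stems, test_size=0.2, random_state=0)

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_train, y_train)
pred = rf.predict(X_test)
print("R2:", round(r2_score(y_test, pred), 2),
      "MAE:", round(mean_absolute_error(y_test, pred), 2))
```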
Pub Date: 2025-02-16 | DOI: 10.1016/j.compag.2025.110123
Boyang Deng, Yuzhen Lu
Robust weed recognition for vision-guided weeding relies on curating large-scale, diverse field datasets, which in practice are difficult to come by. Text-to-image generative artificial intelligence opens new avenues for synthesizing perceptually realistic images beneficial for wide-ranging computer vision tasks in precision agriculture. This study investigates the efficacy of state-of-the-art diffusion models as an image augmentation technique for synthesizing multi-class weed images towards enhanced weed detection performance. A three-season, 10-weed-class dataset was created as a testbed for image generation and weed detection tasks. ControlNet-added stable diffusion models were trained to generate weed images with broad intra-class variations of targeted weed species and diverse backgrounds to adapt to changing field conditions. The quality of the generated images was assessed using metrics including the Fréchet Inception Distance (FID) and Inception Score (IS), resulting in an average FID of 0.98 and IS of 3.63. The generated weed images were selected to supplement real-world images for weed detection by YOLOv8-large. Combining the manually selected generated images with real images yielded an overall mAP@50:95 of 88.3 % and mAP@50 of 95.0 %, representing performance gains of 1.4 % and 0.8 %, respectively, compared to the baseline model trained using only real images. It also performed competitively or comparably with models trained by combining real images with images generated by external, traditional data augmentation techniques. The proposed automated post-generation image filtering approach still needs improvement to select high-quality images for enhanced weed detection. Both the weed dataset and software programs developed in this study have been made publicly available. Considerable research is needed to exploit more controllable diffusion models for generating high-fidelity, diverse weed images to substantially enhance weed detection in changing field conditions.
{"title":"Weed image augmentation by ControlNet-added stable diffusion for multi-class weed detection","authors":"Boyang Deng, Yuzhen Lu","doi":"10.1016/j.compag.2025.110123","DOIUrl":"10.1016/j.compag.2025.110123","url":null,"abstract":"<div><div>Robust weed recognition for vision-guided weeding relies on curating large-scale, diverse field datasets, which however are practically difficult to come by. Text-to-image generative artificial intelligence opens new avenues for synthesizing perceptually realistic images beneficial for wide-ranging computer vision tasks in precision agriculture. This study investigates the efficacy of state-of-the-art diffusion models as an image augmentation technique for synthesizing multi-class weed images towards enhanced weed detection performance. A three-season 10-weed-class dataset was created as a testbed for image generation and weed detection tasks. The ControlNet-added stable diffusion models were trained to generate weed images with broad intra-class variations of targeted weed species and diverse backgrounds to adapt to changing field conditions. The quality of generated images was assessed using metrics including the Fréchet Inception Distance (FID) and Inception Score (IS), resulting in an average FID of 0.98 and IS of 3.63. The generated weed images were selected to supplement real-world images for weed detection by YOLOv8-large. Combining the manually selected, generated images with real images yielded an overall mAP@50:95 of 88.3 % and mAP@50 of 95.0 %, representing performance gains of 1.4 % and 0.8 %, respectively, compared to the baseline model trained using only real images. It also performed competitively or comparably with modeling by combining real images with the images generated by external, traditional data augmentation techniques. The proposed automated post-generation image filtering approach still needs improvements to select high-quality images for enhanced weed detection. Both the weed dataset<span><span><sup>1</sup></span></span> and software programs<span><span><sup>2</sup></span></span> developed in this study have been made publicly available. Considerable research is needed to exploit more controllable diffusion models for generating high-fidelity, diverse weed images to substantially enhance weed detection in changing field conditions.</div></div>","PeriodicalId":50627,"journal":{"name":"Computers and Electronics in Agriculture","volume":"232 ","pages":"Article 110123"},"PeriodicalIF":7.7,"publicationDate":"2025-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}