Pub Date: 2025-12-06 | DOI: 10.1016/j.aiia.2025.12.001
Honghao Zhou , Bingxi Qin , Qing Li , Wenlong Su , Shaowei Liang , Haijiang Min , Jingrong Zang , Shichao Jin , Dong Jiang , Jiawei Chen
Automated phenotyping of wheat growth stages from 3D point clouds is still limited. This study presents a concise framework that reconstructs multi-view UAS imagery into 3D point clouds (jointing to maturity) and performs plot-level phenotyping. A novel 3D wheat plot detection network—integrating spatial–channel coordinated attention and area attention modules—improves depth-direction feature recognition, and a point-cloud-density-based row segmentation algorithm enables planting-row-scale plot delineation. A supporting software system facilitates 3D visualization and automated extraction of phenotypic parameters. We introduce a dynamic phenotypic index of five temporal metrics (growth stage, slow growth stage, height/area reduction stage, maximum height/area difference stage, and height/area change rate) for growth-stage classification and yield prediction using static and time-series models. Experiments show strong agreement between predicted and measured plot heights (R² = 0.937); the detection network achieved AP3D = 94.15 % and APBEV = 95.35 % in “easy” mode; and a Bi-LSTM incorporating dynamic traits reached 82.37 % prediction accuracy for leaf area and yield, a 6.14 % improvement over static-trait models. This workflow supports high-throughput 3D phenotyping and reliable yield estimation for precision agriculture.
{"title":"Integrating 3D detection networks and dynamic temporal phenotyping for wheat yield classification and prediction","authors":"Honghao Zhou , Bingxi Qin , Qing Li , Wenlong Su , Shaowei Liang , Haijiang Min , Jingrong Zang , Shichao Jin , Dong Jiang , Jiawei Chen","doi":"10.1016/j.aiia.2025.12.001","DOIUrl":"10.1016/j.aiia.2025.12.001","url":null,"abstract":"<div><div>Automated phenotyping of wheat growth stages from 3D point clouds is still limited. The study presents a concise framework that reconstructs multi-view UAS imagery into 3D point clouds (jointing to maturity) and performs plot-level phenotyping. A novel 3D wheat plot detection network—integrating spatial–channel coordinated attention and area attention modules—improves depth-direction feature recognition, and a point-cloud-density-based row segmentation algorithm enables planting-row-scale plot delineation. A supporting software system facilitates 3D visualization and automated extraction of phenotypic parameters. We introduce a dynamic phenotypic index of five temporal metrics (growth stage, slow growth stage, height/area reduction stage, maximum height/area difference stage, and height/area change rate) for growth-stage classification and yield prediction using static and time-series models. Experiments show strong agreement between predicted and measured plot heights (R<sup>2</sup> = 0.937); the detection net achieved AP<sub>3D</sub> = 94.15 % and AP<sub>BEV</sub> = 95.35 % in “easy” mode; and a Bi-LSTM incorporating dynamic traits reached 82.37 % prediction accuracy for leaf area and yield, a 6.14 % improvement over static-trait models. This workflow supports high-throughput 3D phenotyping and reliable yield estimation for precision agriculture.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 603-618"},"PeriodicalIF":12.4,"publicationDate":"2025-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145747432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 | DOI: 10.1016/j.aiia.2025.11.011
Renato H. Furlanetto , Ana C. Buzanini , Arnold W. Schumann , Nathan S. Boyd
Targeted application aims to minimize product usage by spraying only where needed. However, there is a lack of tools to evaluate the potential savings and the model's limitations. To address this, we developed a computer simulation using two YOLOv8x models (bounding boxes and segmentation masks) to assess spray volume reduction (SVR) and nozzle limitations across varying weed densities. We used Python, OpenCV, and Tkinter to analyze videos from row-middle trials at the University of Florida. The system tracks weeds crossing a horizontal detection line within a 70 cm row-middle width, dividing the frame into one to eight vertical zones representing nozzle distribution. When a weed is detected, the corresponding frame is saved. The system calculates the activation time in seconds by considering the total number of video frames. Spray volume calculations were based on manual measurements of a TeeJet 8001VS nozzle tip, which dispenses a known volume of liquid per second at 35 PSI. The activation time was multiplied by this rate to estimate the targeted spray volume. The broadcast application volume was calculated by multiplying the total video duration by the same tip output. The results showed that the models achieved up to 96 % accuracy (mAP) with no statistical difference. A polynomial model for low and medium weed densities demonstrated SVR of 74 % (six nozzles) and 57 % (seven nozzles). A linear model for high density achieved a 40 % reduction. The lowest reduction occurred with a single nozzle (4 % for medium, 2 % for high density). These findings demonstrate that nozzle density significantly impacted spray reduction at medium and high densities while low-density savings remained consistent.
{"title":"Optimizing herbicide reduction: A simulation approach using Artificial Intelligence and different nozzle configurations","authors":"Renato H. Furlanetto , Ana C. Buzanini , Arnold W. Schumann , Nathan S. Boyd","doi":"10.1016/j.aiia.2025.11.011","DOIUrl":"10.1016/j.aiia.2025.11.011","url":null,"abstract":"<div><div>Targeted application aims to minimize product usage by spraying only where needed. However, there is a lack of tools to evaluate the potential savings and the model's limitations. To address this, we developed a computer simulation using two YOLOv8x models (bounding boxes and segmentation masks) to assess spray volume reduction (SVR) and nozzle limitations across varying weed densities. We used Python, OpenCV, and Tkinter to analyze videos from row-middle trials at the University of Florida. The system tracks weeds crossing a horizontal detection line within a 70 cm row-middle width, dividing the frame into one to eight vertical zones representing nozzle distribution. When a weed is detected, the corresponding frame is saved. The system calculates the activation time in seconds by considering the total number of video frames. Spray volume calculations were based on manual measurements of a TeeJet 8001VS nozzle tip, which dispenses a known volume of liquid per second at 35 PSI. The activation time was multiplied by this rate to estimate the targeted spray volume. The broadcast application volume was calculated by multiplying the total video duration by the same tip output. The results showed that the models achieved up to 96 % accuracy (mAP) with no statistical difference. A polynomial model for low and medium weed densities demonstrated SVR of 74 % (six nozzles) and 57 % (seven nozzles). A linear model for high density achieved a 40 % reduction. The lowest reduction occurred with a single nozzle (4 % for medium, 2 % for high density). These findings demonstrate that nozzle density significantly impacted spray reduction at medium and high densities while low-density savings remained consistent.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 592-602"},"PeriodicalIF":12.4,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145747431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-27 | DOI: 10.1016/j.aiia.2025.11.010
Yiqiang Zheng , Jun Fu , Fengshuang Liu , Haiming Zhao , Jindai Liu
Classifying mechanical damage in maize kernels using hyperspectral imaging is crucial for food security and loss reduction. Existing methods are constrained by high computational complexity and limited precision in detecting subtle damages, such as pericarp damage and kernel cracks. To address these challenges, we introduce a novel algorithm, the Synergistic Convolution and Linear Attention Transformer (SCLFormer). By replacing traditional softmax attention with linear attention, we reduce computational complexity from quadratic to linear, thereby enhancing efficiency. Integrating convolutional operations into the encoder enriches local prior information for global feature modeling, improving classification accuracy. SCLFormer achieves an overall accuracy of 97.08 % in classifying maize kernel damage, with over 85 % accuracy for cracked kernel and pericarp damage. Compared to softmax attention, SCLFormer reduces training and testing times by 355.27 s (16.86 %) and 0.71 s (23.67 %), respectively. Additionally, we propose a modular hyperspectral image-level classification framework that can integrate existing pixel-level feature extraction networks to achieve classification accuracies exceeding 80 %, demonstrating the framework's scalability. SCLFormer, serving as the framework's dedicated feature extraction component, provides a robust solution for maize kernel damage classification and exhibits substantial potential for broader spatial-scale applications. This framework establishes a novel technical paradigm for hyperspectral image-wise classification of other agricultural products.
{"title":"SCLFormer: A synergistic convolution-linear attention transformer for hyperspectral image classification of mechanical damage in maize kernels","authors":"Yiqiang Zheng , Jun Fu , Fengshuang Liu , Haiming Zhao , Jindai Liu","doi":"10.1016/j.aiia.2025.11.010","DOIUrl":"10.1016/j.aiia.2025.11.010","url":null,"abstract":"<div><div>Classifying mechanical damage in maize kernels using hyperspectral imaging is crucial for food security and loss reduction. Existing methods are constrained by high computational complexity and limited precision in detecting subtle damages, such as pericarp damage and kernel cracks. To address these challenges, we introduce a novel algorithm, the Synergistic Convolution and Linear Attention Transformer (SCLFormer). By replacing traditional softmax attention with linear attention, we reduce computational complexity from quadratic to linear, thereby enhancing efficiency. Integrating convolutional operations into the encoder enriches local prior information for global feature modeling, improving classification accuracy. SCLFormer achieves an overall accuracy of 97.08 % in classifying maize kernel damage, with over 85 % accuracy for cracked kernel and pericarp damage. Compared to softmax attention, SCLFormer reduces training and testing times by 355.27 s (16.86 %) and 0.71 s (23.67 %), respectively. Additionally, we propose a modular hyperspectral image-level classification framework that can integrate existing pixel-level feature extraction networks to achieve classification accuracies exceeding 80 %, demonstrating the framework's scalability. SCLFormer, serving as the framework's dedicated feature extraction component, provides a robust solution for maize kernel damage classification and exhibits substantial potential for broader spatial-scale applications. This framework establishes a novel technical paradigm for hyperspectral image-wise classification of other agricultural products.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 565-577"},"PeriodicalIF":12.4,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145693322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-26 | DOI: 10.1016/j.aiia.2025.11.006
Yizhi Zhang , Liping Chen , Xin Li , Qicheng Li , Jiangbo Li
Packaging is a crucial step in the commercial distribution of apples after harvest. However, achieving fast and accurate automated packaging remains a challenge. This study proposed, for the first time, a highly integrated four-arm robotic system and an effective implementation strategy specifically designed for the automatic packaging of apples. To achieve accurate and simultaneous detection of multi-channel apples under complex visual conditions, an enhanced target detection model (YOLOv8s-GUC) based on the YOLOv8s architecture was developed by combining it with a global attention mechanism (GAM), a UniRepLKNet large-kernel convolution structure, and a CARAFE lightweight upsampling module. The results demonstrated that the YOLOv8s-GUC model significantly improves detection accuracy and generalization for apples and related small features (such as stems and calyxes), achieving an mAP@0.5 of 99.5 %. Furthermore, considering the complexity and real-time constraints of task planning in multi-arm robotic operations, this study further proposed an intelligent scheduling algorithm based on a Deep Q-Network (DQN), which enables efficient collaboration and collision-free operation among the robotic arms through real-time state perception and online decision-making. The results of simulations and real-world experiments indicated that the developed multi-arm robotic packaging system and scheduling strategy achieved high operational stability and efficiency in apple packaging, with a success rate of 100 % and an average packaging time of less than one second per apple. This study provides an effective and reliable solution for automated apple packaging.
{"title":"Multi-arm robotic system and strategy for the automatic packaging of apples","authors":"Yizhi Zhang , Liping Chen , Xin Li , Qicheng Li , Jiangbo Li","doi":"10.1016/j.aiia.2025.11.006","DOIUrl":"10.1016/j.aiia.2025.11.006","url":null,"abstract":"<div><div>Packaging is a crucial step in the commercial distribution of apples after harvest. However, achieving fast and accurate automated packaging remains a challenge. This study proposed for the first time a highly integrated four-arm robotic system and an effective implementation strategy specifically designed for automatic packaging of apples. In order to achieve accurate and simultaneous detection of multi-channel apples in complex visual conditions, an enhanced target detection model (i.e. YOLOv8s-GUC) based on the YOLOv8s architecture was developed by combining with a global attention mechanism (GAM), UniRepLKNet large kernel convolution structure, and CARAFE lightweight upsampling module. The results demonstrated that the YOLOv8s-GUC model can significantly improve detection accuracy and generalization for apples and related small features (such as stems and calyxes), achieving an [email protected] of 99.5 %. Furthermore, considering the complexity and real-time constraints of task planning in multi-arm robotic operations, this study further proposed an intelligent scheduling algorithm based on a Deep Q-Network (DQN), which enables efficient collaboration and collision-free operation among the robotic arms through real-time state perception and online decision-making. The results of simulations and real-world experiments indicated that the developed multi-arm robotic packaging system and scheduling strategy had high operational stability and efficiency in apple packaging, with a success rate of 100 % and an average packaging time of less than one second per apple. This study provides an effective and reliable solution for automated apple packaging.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 578-591"},"PeriodicalIF":12.4,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145693435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-24 | DOI: 10.1016/j.aiia.2025.11.009
Xin Yang , Ruiming Du , Hanyang Huang , Jiayang Xie , Pengyao Xie , Leisen Fang , Ziyue Guo , Nanjun Jiang , Yu Jiang , Haiyan Cen
Organ segmentation of plant point clouds is a prerequisite for the high-resolution and accurate extraction of organ-level phenotypic traits. Although the fast development of deep learning has driven much research on segmentation of plant point clouds, the existing techniques for organ segmentation still face limitations in resolution, segmentation accuracy, and generalizability across various plant species. In this study, we proposed a novel approach called plant segmentation neural radiance fields (PlantSegNeRF), aiming to directly generate high-precision instance point clouds from multi-view RGB image sequences for a wide range of plant species. PlantSegNeRF performed two-dimensional (2D) instance segmentation on the multi-view images to generate instance masks for each organ with a corresponding instance identification (ID). The multi-view instance IDs corresponding to the same plant organ were then matched and refined using a specially designed instance matching (IM) module. The instance NeRF was developed to render an implicit scene containing color, density, semantic and instance information, which was ultimately converted into high-precision plant instance point clouds based on volume density. The results proved that in semantic segmentation of point clouds, PlantSegNeRF outperformed the commonly used methods, demonstrating an average improvement of 16.1 %, 18.3 %, 17.8 %, and 24.2 % in precision, recall, F1-score, and intersection over union (IoU) compared to the second-best results on structurally complex datasets. More importantly, PlantSegNeRF exhibited significant advantages in instance segmentation. Across all plant datasets, it achieved average improvements of 11.7 %, 38.2 %, 32.2 % and 25.3 % in mean precision (mPrec), mean recall (mRec), mean coverage (mCov), and mean weighted coverage (mWCov), respectively. Furthermore, PlantSegNeRF demonstrates superior few-shot, cross-species performance, requiring only multi-view images of a few plants to train models applicable to specific or similar varieties. This study extends organ-level plant phenotyping and provides a high-throughput way to supply high-quality 3D data for developing large-scale artificial intelligence (AI) models in plant science.
{"title":"PlantSegNeRF: A few-shot, cross-species method for plant 3D instance point cloud reconstruction via joint-channel NeRF with multi-view image instance matching","authors":"Xin Yang , Ruiming Du , Hanyang Huang , Jiayang Xie , Pengyao Xie , Leisen Fang , Ziyue Guo , Nanjun Jiang , Yu Jiang , Haiyan Cen","doi":"10.1016/j.aiia.2025.11.009","DOIUrl":"10.1016/j.aiia.2025.11.009","url":null,"abstract":"<div><div>Organ segmentation of plant point clouds is a prerequisite for the high-resolution and accurate extraction of organ-level phenotypic traits. Although the fast development of deep learning has boosted much research on segmentation of plant point clouds, the existing techniques for organ segmentation still face limitations in resolution, segmentation accuracy, and generalizability across various plant species. In this study, we proposed a novel approach called plant segmentation neural radiance fields (PlantSegNeRF), aiming to directly generate high-precision instance point clouds from multi-view RGB image sequences for a wide range of plant species. PlantSegNeRF performed two-dimensional (2D) instance segmentation on the multi-view images to generate instance masks for each organ with a corresponding instance identification (ID). The multi-view instance IDs corresponding to the same plant organ were then matched and refined using a specially designed instance matching (IM) module. The instance NeRF was developed to render an implicit scene containing color, density, semantic and instance information, which was ultimately converted into high-precision plant instance point clouds based on volume density. The results proved that in semantic segmentation of point clouds, PlantSegNeRF outperformed the commonly used methods, demonstrating an average improvement of 16.1 %, 18.3 %, 17.8 %, and 24.2 % in precision, recall, F1-score, and intersection over union (IoU) compared to the second-best results on structurally complex datasets. More importantly, PlantSegNeRF exhibited significant advantages in instance segmentation. Across all plant datasets, it achieved average improvements of 11.7 %, 38.2 %, 32.2 % and 25.3 % in mean precision (mPrec), mean recall (mRec), mean coverage (mCov), and mean weighted coverage (mWCov), respectively. Furthermore, PlantSegNeRF demonstrates superior few-shot, cross-species performance, requiring only multi-view images of few plants to train models applicable to specific or similar varieties. This study extends organ-level plant phenotyping and provides a high-throughput way to supply high-quality 3D data for developing large-scale artificial intelligence (AI) models in plant science.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 546-564"},"PeriodicalIF":12.4,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145623318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-22 | DOI: 10.1016/j.aiia.2025.11.008
Zhi-xin Yao , Hao Wang , Zhi-jun Meng , Liang-liang Yang , Tai-hong Zhang
To improve environmental perception and ensure reliable agricultural machinery navigation during field transitions on unstructured farm roads, this study uses high-resolution RGB camera vision navigation and proposes a Multi-Scale Feature Alignment Network (MSFA-Net) for 19-class semantic segmentation of agricultural environments, covering classes such as roads, pedestrians, and vehicles. MSFA-Net introduces two key innovations: the DASP module, which integrates multi-scale feature extraction with dual attention mechanisms (spatial and channel), and the MSFA architecture, which enables robust boundary extraction and mitigates interference from lighting variations and obstacles such as vegetation. Compared to existing models, MSFA-Net uniquely combines efficient multi-scale feature extraction with real-time inference capabilities, achieving an mIoU of 84.46 % and an mPA of 96.10 %. For 512 × 512 input images, the model processes an average of 26 images/s on a GTX 1650Ti, with a boundary extraction error of less than 0.47 m within a 20 m range. These results indicate that the proposed MSFA-Net can significantly reduce navigation errors and improve the perception stability of agricultural machinery during field operations. Furthermore, the model can be exported to ONNX or TensorFlow Lite formats, facilitating efficient deployment on embedded devices and existing farm navigation systems.
{"title":"Multi-scale feature alignment network for 19-class semantic segmentation in agricultural environments","authors":"Zhi-xin Yao , Hao Wang , Zhi-jun Meng , Liang-liang Yang , Tai-hong Zhang","doi":"10.1016/j.aiia.2025.11.008","DOIUrl":"10.1016/j.aiia.2025.11.008","url":null,"abstract":"<div><div>To improve environmental perception and ensure reliable agricultural machinery navigation during field transitions under unstructured farm road conditions, this study utilizes high-resolution RGB camera vision navigation technology to propose a Multi-Scale Feature Alignment Network (MSFA-Net) for 19-class semantic segmentation of agricultural environment, which includes information such as roads, pedestrians, and vehicles. MSFA-Net introduces two key innovations: the DASP module, which integrates multi-scale feature extraction with dual attention mechanisms (spatial and channel), and the MSFA architecture, which enables robust boundary extraction and mitigates interference from lighting variations and obstacles like vegetation. Compared to existing models, MSFA-Net uniquely combines efficient multi-scale feature extraction with real-time inference capabilities, achieving an mIoU of 84.46 % and an mPA of 96.10 %. For 512 × 512 input images, the model processes an average of 26 images/s on a GTX 1650Ti, with a boundary extraction error of less than 0.47 m within 20 m. These results indicate that the proposed MSFA-Net can significantly reduce navigation errors and improve the perception stability of agricultural machinery during field operations. Furthermore, the model can be exported to ONNX or TensorFlow Lite formats, facilitating efficient deployment on embedded devices and existing farm navigation systems.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 529-545"},"PeriodicalIF":12.4,"publicationDate":"2025-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145623317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-15 | DOI: 10.1016/j.aiia.2025.11.005
Yuhao Zhou , Xiao Feng , Shuqi Tang , Jinpeng Yang , Shaobin Chen , Xiangbao Meng , Zhanpeng Liang , Ruijun Ma , Long Qi
Accurate detection of residual unfilled grains on threshed rice panicles is a critical step in determining a reliable grain-setting rate, and holds significant potential for the development of high-quality rice strains. Recent deep learning-based techniques have been actively explored for discerning various types of objects. However, this detection task is challenging, as many objects are densely occluded by branches or other unfilled grains. Additionally, some unfilled grains are closely adjacent and small in size, further complicating the detection process. To address these challenges, this paper proposes a novel Channel-global Spatial-local Dual Attention (CSDA) module, aimed at enhancing feature correlation learning and contextual information embedding. Specifically, channel- and spatial-wise attention are deployed on two parallel branches and incorporated with the global and local representation learning paradigms, respectively. Furthermore, we integrate the CSDA module into the backbone of an object detector, and refine the loss function and detection head using Focaler-SIoU and a tiny-object prediction head. This enables the object detector to effectively differentiate residual unfilled grains from occlusions while focusing on the subtle differences between closely adjacent, small-sized unfilled grains. Experimental results show that our work achieves superior detection performance versus other competitors, with an mAP@0.5 of 95.3 % (outperforming rivals by 1.5–32.6 %) and a frame rate of 154 FPS (outperforming rivals by 12–132 FPS), showing substantial potential for practical applications.
{"title":"Dual attention guided context-aware feature learning for residual unfilled grains detection on threshed rice panicles","authors":"Yuhao Zhou , Xiao Feng , Shuqi Tang , Jinpeng Yang , Shaobin Chen , Xiangbao Meng , Zhanpeng Liang , Ruijun Ma , Long Qi","doi":"10.1016/j.aiia.2025.11.005","DOIUrl":"10.1016/j.aiia.2025.11.005","url":null,"abstract":"<div><div>Accurate detection of residual unfilled grains on threshed rice panicles is a critical step in determining a reliable grain-setting rate, and holds significant potential for the development of high-quality rice strains. Recent deep learning-based techniques have been actively explored for discerning various types of objects. However, this detection task is challenging, as many objects are densely occluded by branches or other unfilled grains. Additionally, some unfilled grains are closely adjacent and exhibit small sizes, further complicating the detection process. To address these challenges, this paper proposes a novel Channel-global Spatial-local Dual Attention (CSDA) module, aimed at enhancing feature correlation learning and contextual information embedding. Specifically, the channel- and spatial-wise attention are deployed on two parallel branches, and incorporated with the global and local representation learning paradigm, respectively. Furthermore, we integrate the CSDA module with the backbone of an object detector, and refine the loss function and detection head using the Focaler-SIoU and tiny object prediction head. This enables the object detector to effectively differentiate residual unfilled grains from occlusions, and at the meantime, focusing on the subtle differences between closely adjacent and small-sized unfilled grains. Experimental results show that our work achieves superior detection performance versus other competitors with an [email protected] of 95.3 % (outperforming rivals by 1.5–32.6 %) and a frame rate of 154 FPS (outperforming rivals by 12–132 FPS), enjoying substantial potentials for practical applications.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 514-528"},"PeriodicalIF":12.4,"publicationDate":"2025-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145623316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.1016/j.aiia.2025.10.016
Guohua Gao, Lifa Fang, Zihua Zhang, Jiahao Li
Automated pruning and defoliation of tomato plants are essential in modern cultivation systems for optimizing canopy structure, enhancing air circulation, and increasing yield. However, detecting main stems in the field faces significant challenges, such as complex background interference, limited field of view, dense foliage occlusion, and curved stems. To address these challenges while ensuring hardware friendliness, computational efficiency, and real-time response, this study proposes a lightweight tomato main stem detection, optimization, and deployment scheme. First, an efficient semi-automatic rotated-bounding-box annotation strategy is employed to segment the visible main stem segments, improving adaptability to curved stems. Second, a lightweight network, YOLOR-Slim, is constructed to significantly reduce model complexity while maintaining detection performance, through automated iterative pruning at the group level of channel importance and a hybrid feature-based and logit-based knowledge distillation mechanism. Finally, efficient, real-time main stem detection is achieved by deploying the model on inference engines and embedded platforms with various types and quantization bit widths. Experimental results showed that YOLOR-Slim achieved 87.5 % mAP@50, 1.9 GFLOPs, 1.4 M parameters, and 7.4 ms inference time (pre-processing, inference, and post-processing) on the workstation, representing reductions of 2.8 %, 10.0 M, and 27.5 GFLOPs compared to the original model. After conversion with TensorRT, the inference time on Jetson Nano reached 57.6 ms, validating the operational efficiency and deployment applicability on edge devices. YOLOR-Slim strikes a balance between inference speed, computational resource usage, and detection accuracy, providing a reliable perceptual foundation for automated pruning tasks in precision agriculture.
{"title":"Advancing lightweight and efficient detection of tomato main stems for edge device deployment","authors":"Guohua Gao, Lifa Fang, Zihua Zhang, Jiahao Li","doi":"10.1016/j.aiia.2025.10.016","DOIUrl":"10.1016/j.aiia.2025.10.016","url":null,"abstract":"<div><div>Automated pruning and defoliation of tomato plants are essential in modern cultivation systems for optimizing canopy structure, enhancing air circulation, and increasing yield. However, detecting main stems in the field faces significant challenges, like complex background interference, limited field of view, dense foliage occlusion, and curved stems. To address those challenges while ensuring hardware friendliness, computational efficiency, and real-time response, this study proposes a lightweight tomato main stem detection, optimisation, and deployment scheme. First, an efficient semi-automatic rotated bounding box annotation strategy is employed to segment the visible main stem segments, thus improving the adaptability to curved stems. Second, the lightweight network, YOLOR-Slim, is constructed to significantly reduce model complexity while maintaining detection performance through automated iterative pruning at the group-level of channel importance and a hybrid feature-based and logic-based knowledge distillation mechanism. Finally, an efficient and real-time main stem detection is achieved by deploying the model on inference engines and embedded platforms with various types and quantization bits. Experimental results showed that YOLOR-Slim achieved 87.5 % mAP@50, 1.9G Flops, 1.4 M parameters, and 7.4 ms inference time (pre-processing, inference, and post-processing) on the workstation, representing reductions of 2.8 %, 10.0 M, and 27.5G compared to the original model. After conversion with TensorRT, the inference time on Jetson Nano reached 57.6 ms, validating the operational efficiency and deployment applicability on edge devices. The YOLOR-Slim strikes a balance between inference speed, computational resources usage, and detection accuracy, providing a reliable perceptual foundation for automated pruning tasks in precision agriculture.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 458-479"},"PeriodicalIF":12.4,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145579034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-05 | DOI: 10.1016/j.aiia.2025.11.004
Wenxiang Xu , Yejun Zhu , Maohua Xiao , Mengnan Liu , Liling Ye , Yanpeng Yang , Ze Liu
The distributed drive electric plant protection vehicle (DDEPPV), equipped with a unique four-wheel independent drive system, demonstrates excellent path-tracking capability and dynamic performance in agricultural environments. However, during actual field operations, issues such as severe tire slip, poor driving stability, high rollover risk, and excessive energy consumption often arise due to improper torque distribution. This study proposes an energy-efficient and stability-enhancing control method based on active torque distribution, aiming to improve both operational safety and system efficiency. A hierarchical control architecture is adopted: the upper-level controller employs a nonlinear model predictive control (NMPC) to achieve coordinated control of steering and yaw moment, enhancing lateral stability and ensuring operational safety. The lower-level controller implements a direct torque allocation method based on an adaptive-weight multi-objective twin delayed deep deterministic policy gradient (AW-MO-TD3) algorithm, enabling joint optimization of tire slip ratio and energy consumption. Real-vehicle tests were conducted under two typical field conditions, and the results show that compared with conventional methods, the proposed strategy significantly improves key performance metrics including tracking accuracy, vehicle stability, and energy efficiency. Specifically, stability was enhanced by 29.1 % and 41.4 %, while energy consumption was reduced by 19.8 % and 21.1 % under dry plowed terrain and muddy rice field conditions, respectively. This research provides technical support for the intelligent control of distributed drive electric agricultural vehicles.
{"title":"Energy-saving and stability-enhancing control for unmanned distributed drive electric plant protection vehicle based on active torque distribution","authors":"Wenxiang Xu , Yejun Zhu , Maohua Xiao , Mengnan Liu , Liling Ye , Yanpeng Yang , Ze Liu","doi":"10.1016/j.aiia.2025.11.004","DOIUrl":"10.1016/j.aiia.2025.11.004","url":null,"abstract":"<div><div>The distributed drive electric plant protection vehicle (DDEPPV), equipped with a unique four-wheel independent drive system, demonstrates excellent path-tracking capability and dynamic performance in agricultural environments. However, during actual field operations, issues such as severe tire slip, poor driving stability, high rollover risk, and excessive energy consumption often arise due to improper torque distribution. This study proposes an energy-efficient and stability-enhancing control method based on active torque distribution, aiming to improve both operational safety and system efficiency. A hierarchical control architecture is adopted: the upper-level controller employs a nonlinear model predictive control (NMPC) to achieve coordinated control of steering and yaw moment, enhancing lateral stability and ensuring operational safety. The lower-level controller implements a direct torque allocation method based on an adaptive-weight multi-objective twin delayed deep deterministic policy gradient (AW-MO-TD3) algorithm, enabling joint optimization of tire slip ratio and energy consumption. Real-vehicle tests were conducted under two typical field conditions, and the results show that compared with conventional methods, the proposed strategy significantly improves key performance metrics including tracking accuracy, vehicle stability, and energy efficiency. Specifically, stability was enhanced by 29.1 % and 41.4 %, while energy consumption was reduced by 19.8 % and 21.1 % under dry plowed terrain and muddy rice field conditions, respectively. This research provides technical support for the intelligent control of distributed drive electric agricultural vehicles.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 495-513"},"PeriodicalIF":12.4,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145623315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-05 | DOI: 10.1016/j.aiia.2025.10.017
Wenjie Ai , Guofeng Yang , Zhongren Li , Jiawei Du , Lingzhen Ye , Xuping Feng , Xiangping Jin , Yong He
Leaf chlorophyll content serves as a critical biophysical indicator for characterizing wheat growth status. Traditional measurement using a SPAD meter, while convenient, is hampered by its localized sampling, low efficiency, and destructive nature, making it unsuitable for high-throughput field applications. To overcome these constraints, this research developed a novel approach for assessing canopy SPAD values in winter wheat by leveraging multispectral imagery obtained from an unmanned aerial vehicle (UAV). The generalizability of this methodology was rigorously evaluated through a replication experiment conducted in a subsequent growing season. Throughout the study, canopy reflectance data were acquired across key phenological stages and paired with synchronized ground-based SPAD measurements to construct stage-specific estimation models. The acquired multispectral images were processed to remove soil background interference, from which 17 distinct vegetation indices and 8 texture features were subsequently extracted. An in-depth examination followed, aiming to clarify the evolving interplay of these features with SPAD values throughout growth phases. Among the vegetation indices, the Modified Climate Change Canopy Vegetation Index (MCCCI) displayed a “rise-and-decline” pattern across the season, aligning with the crop's intrinsic growth dynamics and establishing it as a robust and phenologically interpretable proxy. Texture features, particularly contrast and entropy, demonstrated notable associations with SPAD values, reaching their peak strength during the booting stage. Comparative evaluation of various predictive modeling techniques revealed that a Support Vector Regression (SVR) model integrating both vegetation indices and texture features yielded the highest estimation accuracy. This integrated model outperformed models based solely on spectral or textural data, improving estimation accuracy by 23.81 % and 22.48 %, respectively. The model's strong generalization capability was further confirmed on the independent validation dataset from the second year (RMSE = 2.54, R² = 0.748). In summary, this study establishes an effective and transferable framework for non-destructively monitoring chlorophyll content in winter wheat canopies using UAV data.
{"title":"Two-year remote sensing and ground verification: Estimating chlorophyll content in winter wheat using UAV multi-spectral imagery","authors":"Wenjie Ai , Guofeng Yang , Zhongren Li , Jiawei Du , Lingzhen Ye , Xuping Feng , Xiangping Jin , Yong He","doi":"10.1016/j.aiia.2025.10.017","DOIUrl":"10.1016/j.aiia.2025.10.017","url":null,"abstract":"<div><div>Leaf chlorophyll content serves as a critical biophysical indicator for characterizing wheat growth status. Traditional measurement using a SPAD meter, while convenient, is hampered by its localized sampling, low efficiency, and destructive nature, making it unsuitable for high-throughput field applications. To overcome these constraints, this research developed a novel approach for assessing canopy SPAD values in winter wheat by leveraging multispectral imagery obtained from an unmanned aerial vehicle (UAV). The generalizability of this methodology was rigorously evaluated through a replication experiment conducted in a subsequent growing season. Throughout the study, canopy reflectance data were acquired across key phenological stages and paired with synchronized ground-based SPAD measurements to construct stage-specific estimation models. The acquired multispectral images were processed to remove soil background interference, from which 17 distinct vegetation indices and 8 texture features were subsequently extracted. An in-depth examination followed, aiming to clarify the evolving interplay of these features with SPAD values throughout growth phases. Among the vegetation indices, the Modified Climate Change Canopy Vegetation Index (MCCCI) displayed a “rise-and-decline” pattern across the season, aligning with the crop's intrinsic growth dynamics and establishing it as a robust and phonologically interpretable proxy. Texture features, particularly contrast and entropy, demonstrated notable associations with SPAD values, reaching their peak strength during the booting stage. Comparative evaluation of various predictive modeling techniques revealed that a Support Vector Regression (SVR) model integrating both vegetation indices and texture features yielded the highest estimation accuracy. This integrated model outperformed models based solely on spectral or textural data, improving estimation accuracy by 23.81 % and 22.48 %, respectively. The model's strong generalization capability was further confirmed on the independent validation dataset from the second year (RMSE = 2.54, R<sup>2</sup> = 0.748). In summary, this study establishes an effective and transferable framework for non-destructively monitoring chlorophyll content in winter wheat canopies using UAV data.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"16 1","pages":"Pages 480-494"},"PeriodicalIF":12.4,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145579035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}