
Latest Publications in Artificial Intelligence in Agriculture

Transfer learning-based soybean LAI estimations by integrating PROSAIL, UAV, and PlanetScope imagery
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-11-02 DOI: 10.1016/j.aiia.2025.10.018
Qing Li, Yanan Wei, Dalei Hao, Weijian Yu, Yelu Zeng
Accurate Leaf Area Index (LAI) estimation at the soybean plot scale is achievable using high-resolution Unmanned Aerial Vehicle (UAV) imagery and field measurement samples. However, the limited coverage of UAV flights restricts large-scale remote sensing monitoring in expansive soybean fields. This study leverages the broad coverage and 3-m resolution of PlanetScope satellite imagery to extend LAI prediction from UAV to satellite scales through transfer learning, using UAV-scale LAI estimates as a benchmark to validate cross-scale consistency. To address this challenge, this study proposes LAI-TransNet, a two-stage transfer learning framework designed for precise and scalable soybean LAI prediction across large areas, demonstrating its effectiveness in cross-scale monitoring. In Stage 1, a UAV-scale benchmark is established using PROSAIL-simulated UAV reflectance data (UAV-Sim) and field-measured soybean LAI. Traditional machine learning, deep learning, and transfer learning models are trained on a hybrid UAV-Sim and field-measured dataset (UAV-Sim_Measured), with the transfer learning model CNN-TL, fine-tuned using pre-trained weights derived from UAV-Sim, achieving the highest accuracy (R² = 0.81, RMSE = 0.64 m²/m², rRMSE = 11.5 %). In Stage 2, LAI-TransNet is developed by fine-tuning the CNN-TL model on PlanetScope-simulated data (PS-Sim), preprocessed via cross-domain mapping to align UAV and satellite spectral features. Real PlanetScope imagery is corrected for reflectance consistency with reference to UAV imagery spectral profiles. LAI-TransNet outperforms other deep learning models trained directly on PS-Sim (R² = 0.69 vs. 0.60–0.63), ensuring robust cross-scale consistency. By bridging UAV and satellite scales, LAI-TransNet enables large-scale soybean LAI monitoring, enhancing precision agriculture management through improved monitoring with PlanetScope imagery.
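The two-stage fine-tuning idea is easy to sketch. The following is a minimal, hypothetical PyTorch version, assuming a small 1-D spectral CNN pre-trained on simulated reflectance and then fine-tuned on a smaller measured set at a lower learning rate; the layer sizes, band count, and random stand-in data are illustrative, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class SpectralCNN(nn.Module):
    """1-D CNN mapping per-pixel band reflectances to a scalar LAI value."""
    def __init__(self, n_bands: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x):                      # x: (batch, n_bands)
        z = self.features(x.unsqueeze(1))      # -> (batch, 32, 1)
        return self.head(z.squeeze(-1))

def fit(model, loader, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()

# Stage 1: pre-train on (stand-in) PROSAIL-simulated UAV reflectance, then
# fine-tune on the small field-measured set at a reduced learning rate.
uav_sim = TensorDataset(torch.rand(512, 8), torch.rand(512, 1) * 7)
field = TensorDataset(torch.rand(64, 8), torch.rand(64, 1) * 7)
model = SpectralCNN(n_bands=8)
fit(model, DataLoader(uav_sim, batch_size=64, shuffle=True), epochs=5, lr=1e-3)
fit(model, DataLoader(field, batch_size=16, shuffle=True), epochs=3, lr=1e-4)
# Stage 2 would repeat the fine-tuning step on PlanetScope-simulated spectra.
```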
Citations: 0
Smart agriculture technology: Real-time generation method of local soil property distribution maps based on WGAN-GPM
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-31 DOI: 10.1016/j.aiia.2025.10.014
Xiaoshuang Zhang, Li Yang, Dongxing Zhang, Tao Cui, Xiantao He, Kailiang Zhang, Zhaohui Du, Zhimin Li, Hongsheng Li, Tianpu Xiao
With the advancement of smart agriculture, precision variable-rate seeding requires high-resolution soil information. However, existing methods still fall short in generating localized soil property maps in real time, limiting inter-row seeding control for wide-span planters. To address this challenge, this paper proposes the Local Soil property distribution maps Generation Network (LSGN), based on the Wasserstein Generative Adversarial Network with Gradient Penalty and Mean squared error (WGAN-GPM). This proposed model integrates a Vision Transformer (ViT) encoder with convolutional and deconvolutional Residual Networks (ResNet), significantly improving feature extraction from discrete soil property data and enhancing the reconstruction accuracy of localized soil maps. The model adopts a two-stage training strategy. First, it performs self-supervised pre-training using local soil property maps with neighborhood information, enabling the network to learn spatial correlations and improve boundary prediction. Second, the pretrained weights are transferred into a WGAN-GPM adversarial training framework, which combines Wasserstein distance, gradient penalty, and Mean Squared Error (MSE) loss to jointly optimize the generator and discriminator. This ensures both pixel-level accuracy and distributional consistency. Compared to standard Generative Adversarial Networks (GANs), the WGAN-GPM-trained LSGN prediction reduces Relative Error Average (REA) by over 1.02 %, RMSE by over 0.16, and Kullback–Leibler Divergence (KLD) by over 1.76 × 10⁻⁴, while increasing Peak Signal-to-Noise Ratio (PSNR) by over 2.43. LSGN achieves REA values of 0.41 %, 1.19 %, and 1.80 % for planting widths of four, six, and eight rows, respectively, outperforming traditional generative models. Field cross-validation further shows that, relative to Kriging interpolation, LSGN reduces boundary region prediction errors by up to 2.42 % and central region errors by 0.97 %. Overall, the REA in local distribution maps decreases by more than 1.5 %, and RMSE is reduced by over 0.4. This study demonstrates that LSGN enables real-time high-precision soil mapping, providing a practical solution for precision seeding.
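As a rough illustration of the adversarial objective, the sketch below computes a WGAN gradient penalty and combines the Wasserstein terms with a pixel-wise MSE term, which is the core of a WGAN-GP-plus-MSE ("WGAN-GPM") loss. The tiny generator and critic, the penalty weight of 10, and the MSE weight of 100 are placeholder assumptions; the paper's ViT/ResNet networks are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gradient_penalty(critic, real, fake):
    """WGAN-GP term: penalize critic gradients that deviate from unit norm."""
    eps = torch.rand(real.size(0), 1, 1, 1)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(mixed).sum(), mixed, create_graph=True)[0]
    return ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

critic = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                       nn.Flatten(), nn.Linear(8 * 32 * 32, 1))
gen = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(8, 1, 3, padding=1))

real = torch.rand(4, 1, 32, 32)   # "true" local soil-property patch
cond = torch.rand(4, 1, 32, 32)   # sparse sampled soil properties as input
fake = gen(cond)

# Critic loss: Wasserstein estimate plus gradient penalty (weight 10).
d_loss = (critic(fake.detach()).mean() - critic(real).mean()
          + 10.0 * gradient_penalty(critic, real, fake.detach()))
# Generator loss: adversarial term plus pixel-wise MSE (the "M" in WGAN-GPM).
g_loss = -critic(fake).mean() + 100.0 * F.mse_loss(fake, real)
```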
Citations: 0
Improved YOLOv8 for multi-colored apple fruit instance segmentation and 3D localization
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-31 DOI: 10.1016/j.aiia.2025.10.013
Jiaren Zhou, Mengyan Chen, Man Zhang, Zhao Zhang, Yao Zhang, Minjuan Wang
Robotic apple harvesting requires precise instance segmentation and 3D localization, especially for multi-colored apples under complex orchard conditions with occlusions and variable lighting. Current deep learning methods lack robustness and accuracy for such scenarios, limiting automation. This study proposes improved YOLOv8-based models to enhance segmentation of multi-colored apples, combined with a high-precision 3D localization pipeline to advance practical robotic harvesting. To address these issues, this study collected apple images in three colors from two locations, creating a dataset of 5171 images. Four enhanced YOLOv8-based models—RA-YOLO, GA-YOLO, YA-YOLO, and MCA-YOLO—were proposed for segmenting red, green, yellow, and mixed multi-colored apples. RA-YOLO integrates the GD mechanism and EMBConv structure based on EfficientNet's MBConv. GA-YOLO replaces standard convolutions with dynamic serpentine convolution and adds the P6 layer for large object detection. YA-YOLO utilizes deformable convolution (DCNv2) and introduces the new attention mechanism MPCA. MCA-YOLO combines the P6 layer, DCNv2, and EMBConv structure, merging the strengths of other models. RA-YOLO, GA-YOLO, and YA-YOLO achieved mAP values of 95.2 %, 96.4 %, and 95.4 %, respectively, for single-colored apple instance segmentation, surpassing baseline models and those in existing literature. MCA-YOLO achieved mAP values of 95.6 %, 96.6 %, and 94.6 % for single-colored apples and 95.6 % for mixed multi-colored apples. Ablation experiments validated the necessity of each module. Finally, a high-precision 3D localization and shaping pipeline was developed, achieving an average localization error of 2.636 mm and a shaping error of 0.768 mm, enabling millimeter-level localization and sub-millimeter-level shaping for apple harvesting optimization.
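The 3D localization step downstream of segmentation can be illustrated with standard pinhole back-projection: mask pixels with valid aligned depth are lifted into camera coordinates and averaged to a fruit center. The intrinsics (fx, fy, cx, cy) and the toy mask and depth map below are invented values, not the paper's calibration or pipeline.

```python
import numpy as np

def locate_fruit(mask: np.ndarray, depth_m: np.ndarray,
                 fx=910.0, fy=910.0, cx=640.0, cy=360.0):
    """Average 3-D position (metres) of all mask pixels with valid depth."""
    v, u = np.nonzero(mask)            # pixel rows/cols inside the fruit mask
    z = depth_m[v, u]
    valid = z > 0                      # drop missing depth readings
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx              # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1).mean(axis=0)

mask = np.zeros((720, 1280), dtype=bool)
mask[300:340, 600:650] = True          # stand-in predicted instance mask
depth = np.full((720, 1280), 0.85)     # stand-in aligned depth map (m)
print(locate_fruit(mask, depth))       # fruit centre in camera coordinates
```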
Citations: 0
PlaneSegNet: A deep learning network with plane attention for plant point cloud segmentation in agricultural environments
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-30 DOI: 10.1016/j.aiia.2025.10.015
Xin Yang, Chenyi Xu, Yan Wang, Ruixia Feng, Jinshi Yu, Zichen Su, Teng Miao, Tongyu Xu
Accurately extracting plant point clouds from complex agricultural environments is essential for high-throughput phenotyping in smart farming. However, existing methods face significant challenges when processing large-scale agricultural point clouds owing to high noise levels, dense spatial distribution, and blurred structural boundaries between plant and non-plant regions. To address these issues, this study proposes PlaneSegNet, a voxel-based semantic segmentation network that incorporates an innovative plane attention module. This module aggregates projection features from the XZ and YZ planes, enhancing the model's ability to detect vertical geometric variations and thereby improving segmentation performance in boundary regions. Extensive experiments across representative agricultural scenarios at multiple scales, including open-field populations, greenhouse cultivation environments, and large-scale rural landscapes, demonstrate that PlaneSegNet significantly outperforms traditional geometry-based approaches and deep-learning models in plant and non-plant separation. By directly generating high-quality plant-only point clouds, PlaneSegNet significantly reduces reliance on manual pre-processing, offering a practical and generalisable solution for automated plant extraction across a wide range of agricultural applications. The dataset and source code used in this study are publicly available at https://github.com/yangxin6/PlaneSegNet.
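One plausible reading of the plane-attention idea is sketched below, assuming a dense voxel feature grid: features are mean-pooled onto the XZ and YZ planes, each projection passes through a small convolution, and the two maps are broadcast back to gate the 3D features. The channel sizes, pooling choice, and sigmoid fusion are guesses; the released code at https://github.com/yangxin6/PlaneSegNet defines the actual module.

```python
import torch
import torch.nn as nn

class PlaneAttention(nn.Module):
    """Gate 3-D voxel features using XZ and YZ plane projections."""
    def __init__(self, c: int):
        super().__init__()
        self.xz = nn.Conv2d(c, c, 3, padding=1)
        self.yz = nn.Conv2d(c, c, 3, padding=1)

    def forward(self, feat):                 # feat: (B, C, X, Y, Z)
        xz = self.xz(feat.mean(dim=3))       # pool out Y -> (B, C, X, Z)
        yz = self.yz(feat.mean(dim=2))       # pool out X -> (B, C, Y, Z)
        # Broadcast both plane maps back to 5-D and fuse into a gate that
        # emphasizes vertical geometric variation.
        gate = torch.sigmoid(xz.unsqueeze(3) + yz.unsqueeze(2))
        return feat * gate

x = torch.rand(1, 16, 32, 32, 16)            # stand-in voxelized scene features
print(PlaneAttention(16)(x).shape)           # -> torch.Size([1, 16, 32, 32, 16])
```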
Citations: 0
Utilizing interpretable machine learning algorithms and multiple features from multi-temporal Sentinel-2 imagery for predicting wheat fusarium head blight
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-27 DOI: 10.1016/j.aiia.2025.10.012
Hui Wang, Chao Ruan, Jinling Zhao, Yunran Wang, Ying Li, Yingying Dong, Linsheng Huang
Wheat Fusarium head blight (FHB) severely affects wheat yields, and predicting its occurrence and spatial distribution is essential for safeguarding crop production. This study presents an interpretable machine learning method designed to predict FHB by leveraging multi-temporal and multi-feature information obtained from Sentinel-2 imagery. During the regreening and grain-filling stages, we extracted vegetation indices (VIs), texture features (TFs), and color indices (CIs). Single-temporal features were derived from the grain-filling stage, while multi-temporal features combined data from the grain-filling and regreening stages. The synthetic minority over-sampling technique (SMOTE) was employed to correct the class imbalance, while the most informative features were selected using the sequential forward selection (SFS) approach. The extreme gradient boosting (XGBoost) model, optimized using a simulated annealing (SA) algorithm and explained via the SHapley Additive exPlanation (SHAP) method, integrated VIs, TFs, and CIs as input features. The presented model demonstrated exceptional results, achieving a prediction accuracy of 89.9 % with multi-temporal features and a Kappa coefficient of 0.797. It outperformed random forest (RF), backpropagation neural network (BPNN), and support vector machine (SVM) models. This study indicates that an interpretable machine learning approach, which utilizes both multi-temporal and multi-feature data, is effective in forecasting FHB, thereby providing a valuable tool for agricultural management and disease prevention strategies.
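The tabular pipeline the abstract describes can be approximated in a few lines of scikit-learn-style code: SMOTE to balance classes, sequential forward selection, an XGBoost classifier, and SHAP values for interpretation. The sketch below uses synthetic stand-in features, omits the simulated-annealing hyperparameter search, and assumes the imblearn, xgboost, and shap packages are installed.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.feature_selection import SequentialFeatureSelector
from xgboost import XGBClassifier
import shap

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))       # stand-in multi-temporal VIs/TFs/CIs
y = (X[:, 0] + 0.1 * rng.normal(size=300) > 0.8).astype(int)  # imbalanced labels

# Balance the minority (diseased) class with synthetic samples.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)

# Forward feature selection wrapped around the XGBoost classifier.
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
sfs = SequentialFeatureSelector(model, n_features_to_select=8, direction="forward")
sfs.fit(X_bal, y_bal)

# Fit on the selected features and compute per-feature SHAP attributions.
X_sel = sfs.transform(X_bal)
model.fit(X_sel, y_bal)
shap_values = shap.TreeExplainer(model).shap_values(X_sel)
```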
Citations: 0
Inversion of plant functional traits from hyperspectral imagery enhances the distinction of wheat stripe rust severity
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-26 DOI: 10.1016/j.aiia.2025.10.006
Kehui Ren, Anting Guo, Binxiang Qian, Chao Ruan, Wenjiang Huang, Yingying Dong, Xia Jing, Kun Wang, Tiecheng Huang, Huiqin Ma
Wheat stripe rust can cause yield losses of up to 40 % during severe outbreaks, underscoring the importance of timely and accurate detection for effective management. Traditional hyperspectral methods relying on vegetation indices (VIs) and texture features (TFs) offer indirect assessments prone to environmental interference. In contrast, plant functional traits (PTs) supply a more consistent and informative reflection of stress progression. Accurately quantifying PTs, alongside traditional VIs and TFs, is therefore essential for constructing a comprehensive and robust disease monitoring framework. First, this study utilized correlation coefficient analysis and variance inflation factor (VIF) analysis to filter conventional variables (VIs and TFs) from candidate feature sets. The hybrid inversion model (HIM) was then employed to retrieve critical PTs (CCC, Car, Anth, CBC, and LAI) from UAV hyperspectral data, while canopy temperature (Tc) was derived from thermal imagery. Subsequently, 28 disease monitoring models were developed using machine learning algorithms (RF, AdaBoost, GBRT, and LASSO), incorporating both individual and combined features. Model performance was rigorously assessed through 6-fold cross-validation. The results demonstrated that PTs significantly responded to wheat rust disease (p-value < 0.001), manifesting as reductions in pigment content (CCC, Car, Anth) and structural parameters (LAI), along with increases in CBC and Tc. During model development, PTs exhibited superior performance over VIs and TFs, given their strong association with plant health and sensitivity to biotic stress. Moreover, the sparsity of LASSO, combined with synergistic integration of multiple feature types, substantially enhanced model accuracy. The optimal model, integrating all three feature categories via LASSO regression, yielded R², RMSE, and MAE values of 0.628, 8.03 %, and 6.57 %, respectively. Overall, this study advances the accuracy of wheat stripe rust monitoring by integrating PTs, VIs, and TFs to create a physiological-spectral-morphological synergy, providing valuable insights for large-scale disease detection on airborne and satellite platforms.
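The final fusion step, a LASSO regression over concatenated plant traits, vegetation indices, and texture features evaluated with 6-fold cross-validation, can be sketched as follows; the synthetic arrays merely stand in for the UAV-derived features, and the alpha value is an assumption.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
pts = rng.normal(size=(150, 6))    # stand-ins for CCC, Car, Anth, CBC, LAI, Tc
vis = rng.normal(size=(150, 10))   # stand-in vegetation indices
tfs = rng.normal(size=(150, 8))    # stand-in texture features
severity = pts @ rng.normal(size=6) + 0.3 * rng.normal(size=150)

# Concatenate the three feature groups; LASSO's L1 penalty zeroes out
# uninformative columns, which is the sparsity the abstract credits.
X = np.hstack([pts, vis, tfs])
model = make_pipeline(StandardScaler(), Lasso(alpha=0.05))

# 6-fold CV mirrors the evaluation protocol described above.
print(cross_val_score(model, X, severity, cv=6, scoring="r2").mean())
```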
Citations: 0
Advancing UAV-based wheat phenology monitoring: A dual-mode framework integrating time-series reconstruction, noise augmentation, and deep learning for robust BBCH estimation
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-25 DOI: 10.1016/j.aiia.2025.10.008
Ziheng Feng, Ziya Zhao, Liunan Suo, Huiling Long, Hao Yang, Xiaoyu Song, Haikuan Feng, Bo Xu, Xinming Ma, Wei Feng
Precise monitoring of wheat phenology (BBCH scale) is essential for agricultural optimization, yet UAV-based single-phase monitoring encounters spectral ambiguities where multiple vegetation indices correspond to identical growth stages. A dual-mode framework integrating time-series reconstruction with hybrid deep learning was developed to resolve this limitation. UAV multispectral and digital imagery (333 plots, 2023–2024) enabled reconstruction of daily-resolved vegetation indices, color/texture features, and BBCH stages using Gaussian, PCHIP, and linear fitting to mitigate environmental noise. Synthetic datasets incorporating Gaussian noise (5–100 % relative intensity) simulated field variability. Feature selection was optimized through Competitive Adaptive Reweighted Sampling (CARS) and Variance Inflation Factor (VIF). Hybrid CNN-GRU and CNN-LSTM architectures surpassed standalone networks by resolving spectral ambiguities in single-phase data and leveraging temporal patterns during time-series analysis. Time-series models attained maximum accuracy under noise-free conditions (CNN-GRU: R² = 0.90–0.98, RMSE = 3.61–7.65 BBCH units), with accuracy decreasing proportionally to noise intensity. Conversely, single-phase models demonstrated peak performance at 20 % noise intensity (CNN-GRU: R² = 0.56–0.70, RMSE = 15.33–17.22 BBCH units), achieving optimal balance between robustness and practicality for real-time farm monitoring. Extreme noise (100 %) distorted feature distributions (7.25–8.73× expansion), validating controlled augmentation. A novel Rate of Phenological Development (RPDW), quantified as the slope of BBCH progression, was derived to inform breeding programs, while the noise-optimized single-phase approach enables resource-efficient phenology tracking for family farms. This work bridges methodological innovation (adaptive noise strategies, hybrid architectures) with scalable solutions for precision agriculture, advancing UAV-based phenology monitoring in both academic and applied contexts.
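Two of the named steps, PCHIP reconstruction of a daily vegetation-index series from sparse flight dates and Gaussian noise augmentation at a relative intensity, plus a toy CNN-GRU head, are sketched below. The flight dates, NDVI values, and layer sizes are invented for illustration.

```python
import numpy as np
import torch
import torch.nn as nn
from scipy.interpolate import PchipInterpolator

days = np.array([0, 14, 28, 45, 60, 80])       # stand-in flight dates (days after sowing)
ndvi = np.array([0.21, 0.45, 0.68, 0.82, 0.74, 0.50])
daily = PchipInterpolator(days, ndvi)(np.arange(81))   # daily-resolved reconstruction

# Gaussian noise scaled to 20 % of the signal std, the reported optimum
# for single-phase models.
rng = np.random.default_rng(0)
noisy = daily + rng.normal(0.0, 0.20 * daily.std(), size=daily.shape)

class CNNGRU(nn.Module):
    """Conv front-end for local temporal patterns, GRU for long-range context."""
    def __init__(self, c=16):
        super().__init__()
        self.conv = nn.Conv1d(1, c, 5, padding=2)
        self.gru = nn.GRU(c, c, batch_first=True)
        self.head = nn.Linear(c, 1)

    def forward(self, x):                          # x: (batch, time)
        h = torch.relu(self.conv(x.unsqueeze(1)))  # (batch, c, time)
        out, _ = self.gru(h.transpose(1, 2))       # (batch, time, c)
        return self.head(out[:, -1])               # BBCH estimate at the last step

x = torch.tensor(noisy, dtype=torch.float32).unsqueeze(0)
print(CNNGRU()(x).shape)                           # -> torch.Size([1, 1])
```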
Citations: 0
A lightweight model based on knowledge distillation for free-range chickens detection in complex commercial farming environments
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-24 DOI: 10.1016/j.aiia.2025.10.010
Xiaoxin Li, Mingrui Cai, Zhen Liu, Chengcheng Yin, Xinjie Tan, Jiangtao Wen, Yuxing Han
Side-view imaging for monitoring free-range chickens on edge devices faces significant challenges due to complex backgrounds, occlusions, and limited computational resources, which particularly affect the performance of lightweight models in terms of their representational capacity and generalization ability. To address these limitations, this study proposes a Lightweight Free-range Chickens Detection Model based on YOLOv8n and knowledge distillation (LCD-YOLOv8n-KD), establishing an optimal balance between detection performance and model efficiency. The YOLOv8n architecture is enhanced by incorporating DualConv, CCFF, PCC2f, and SAHead modules to create LCD-YOLOv8n, significantly reducing model parameters and computational complexity. Further improvement is achieved through knowledge distillation, in which a pre-trained large-scale model developed by our team served as the teacher network and LCD-YOLOv8n functioned as the student network, resulting in the LCD-YOLOv8n-KD model. Experimental validation is conducted using a comprehensive dataset comprising 6000 images with 162,864 labeled chicken targets, collected from various side-view angles in commercial farming environments. LCD-YOLOv8n-KD achieves AP50 values of 95.9 %, 90.2 %, 82.7 %, and 69.3 % on the test set and three independent test sets, respectively. Compared to the original YOLOv8n, the proposed model demonstrates a 16.13 % improvement in AP50 while reducing parameters by 47.84 % and GFLOPs by 41.46 %. The proposed model outperforms other state-of-the-art lightweight models in terms of detection efficiency, accuracy, and generalization capability, demonstrating strong potential for practical deployment in real-world free-range chicken farming environments.
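Response-based knowledge distillation, where a frozen teacher's temperature-softened outputs supervise a small student alongside the ground-truth loss, can be sketched as below. The two MLPs are placeholders for the team's large teacher and LCD-YOLOv8n, and detection-specific details (box and feature-map distillation) are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
student = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))

def kd_loss(s_logits, t_logits, target, T=4.0, alpha=0.7):
    """Blend temperature-scaled KL to the teacher with hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * T * T   # T^2 rescales soft gradients
    hard = F.cross_entropy(s_logits, target)
    return alpha * soft + (1 - alpha) * hard

x = torch.rand(8, 64)                     # stand-in image features
y = torch.randint(0, 10, (8,))            # stand-in ground-truth labels
with torch.no_grad():                     # teacher is frozen during distillation
    t_out = teacher(x)
loss = kd_loss(student(x), t_out, y)
loss.backward()                           # gradients flow only into the student
```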
Citations: 0
STGMAE: A GNSS data-driven pre-training spatiotemporal graph masked autoencoder for agricultural machinery trajectory operation mode identification
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-24 DOI: 10.1016/j.aiia.2025.10.007
Tailai Chen, Weixin Zhai
Utilizing spatiotemporal features in massive amounts of trajectory data to identify the operation mode of agricultural machinery trajectories is a key task in precision agriculture. Most previous studies focus narrowly on single-perspective feature extraction, neglecting the comprehensive spatiotemporal information in trajectory data. To improve the accuracy of this task, this paper proposes a model called STGMAE. First, we propose a multilevel feature extraction method (MFE), which extracts motion and statistical features from the initial features via a motion feature extractor and a sliding time window, and then uses the spectral feature module (SFM) to capture spectral information, improving the representation of trajectory data. Next, to prevent information loss in long-range encoding, we design a pre-training network with serial encoding and parallel decoding. Specifically, the data are first modeled globally and interactively via a multiscale wavelet projector (WMP) and then enter an adaptive graph isomorphic neural network (AGIN). In AGIN, a semi-adaptive masked Laplace operator (SAMLO) is used to capture the correlation information between trajectory points, and a passing mechanism is then used to address the homogeneous relationships between trajectory points and the heterogeneous relationships between trajectory graphs. Then, the original features and graph structure are reconstructed from the two encoding nodes to realize self-supervised training. Finally, we apply the pre-trained weights to real trajectory samples provided by the Key Laboratory of Agricultural Machinery Monitoring and Big Data Application, Ministry of Agriculture and Rural Affairs, which contain 219 trajectory samples (2,250,693 trajectory points). The experimental results show that for the paddy, corn, and wheat harvesting trajectory datasets, our model accuracies are 95.50 %, 95.32 %, and 95.36 %, respectively, and the F1 scores are 94.54 %, 92.09 %, and 93.79 %, respectively. Compared with existing state-of-the-art methods, our method improves accuracy by 5.75 %, 4.47 %, and 5.03 % and F1 scores by 7.26 %, 4.85 %, and 6.65 %, respectively.
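The motion/statistical feature step (MFE) can be approximated as follows: derive speed and heading change from raw GNSS fixes, then pool statistics over a sliding time window. The window length, the two motion features, and the random stand-in track are illustrative assumptions.

```python
import numpy as np

def motion_features(lon, lat, t):
    """Per-step speed and absolute heading change from consecutive GNSS fixes."""
    dx, dy, dt = np.diff(lon), np.diff(lat), np.diff(t)
    speed = np.hypot(dx, dy) / dt            # degrees/s; rescale to m/s as needed
    heading = np.arctan2(dy, dx)
    turn = np.abs(np.diff(heading, prepend=heading[0]))
    return np.stack([speed, turn], axis=1)

def window_stats(feat, win=16):
    """Mean/std/max statistics pooled over non-overlapping time windows."""
    rows = []
    for i in range(0, len(feat) - win + 1, win):
        w = feat[i:i + win]
        rows.append(np.concatenate([w.mean(0), w.std(0), w.max(0)]))
    return np.array(rows)                    # one feature row per window

t = np.arange(200.0)                         # stand-in timestamps (s)
lon = np.cumsum(np.random.default_rng(2).normal(1e-5, 1e-6, 200))
lat = np.cumsum(np.random.default_rng(3).normal(1e-5, 1e-6, 200))
print(window_stats(motion_features(lon, lat, t)).shape)   # -> (12, 6)
```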
Citations: 0
A lightweight keypoint detection model-based method for strawberry recognition and picking point localization in multi-occlusion scenes
IF 12.4 Q1 AGRICULTURE, MULTIDISCIPLINARY Pub Date: 2025-10-24 DOI: 10.1016/j.aiia.2025.10.009
Dezhi Wang, Xiaochan Wang, Yinyan Shi, Xiaolei Zhang, Yanyu Chen, Jinming Zheng, Nan Liu
Strawberries grown on elevated stands usually suffer from fruit occlusion, which severely limits strawberry recognition and picking point localization; in addition, the embedded devices carried by strawberry-picking robots impose strict model lightweighting requirements. Together, these pose a dual challenge to the efficient execution of automated picking tasks by robots. To address this issue, this study proposes a method for strawberry recognition and picking point localization in multi-occlusion scenes based on a lightweight keypoint detection model. Firstly, a strawberry dataset covering no, slight, moderate, and heavy occlusion scenes is constructed. Then, a lightweight strawberry recognition and keypoint detection network, LS-net, is proposed. LS-net improves the spatial relationship modelling capability between strawberries and stems by integrating the lightweight MobileNetv4 backbone with the Mobile Grouped-Query Attention mechanism; it improves the feature pyramid network using depthwise separable convolutions and incorporates an anchor-free decoupled head network to reduce computational complexity while maintaining detection accuracy; and it introduces Matrix Non-Maximum Suppression to optimize the processing of overlapping strawberries, which effectively reduces false negative detections. Based on the keypoint detection results from LS-net, the picking point coordinates and stem pose are calculated after a series of processes such as region-of-interest extraction, binarization, and depth data alignment. The experimental results show that the accuracy of LS-net is 91.07 %, the mean average precision is 93.93 %, and the average Euclidean distance error is 4.79 pixels. Deployed on an embedded device, LS-net runs at 78.2 frames per second, and the success rates of 3D picking point localization and stem pose estimation are 84.07 % and 81.32 %, respectively. LS-net and related methods provide a visual recognition solution adapted to embedded devices for strawberry picking robots.
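The picking-point geometry can be illustrated with a simple two-keypoint rule: place the cut point a fixed fraction along the segment from the fruit top to a detected stem point and report the stem's in-plane pose angle. The fraction, keypoint names, and coordinates below are invented, not LS-net's actual post-processing.

```python
import numpy as np

def picking_point(fruit_top, stem_point, frac=0.5):
    """Cut point part-way up the stem, plus the stem's in-plane orientation."""
    fruit_top, stem_point = np.asarray(fruit_top, float), np.asarray(stem_point, float)
    cut = fruit_top + frac * (stem_point - fruit_top)   # point along the stem
    d = stem_point - fruit_top
    pose_deg = np.degrees(np.arctan2(d[1], d[0]))       # stem pose angle
    return cut, pose_deg

# Stand-in keypoints in image coordinates (pixels).
cut_xy, angle = picking_point(fruit_top=(412, 305), stem_point=(420, 258))
print(cut_xy, angle)
```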
Citations: 0