Janez Lapajne, Andrej Vončina, Ana Vojnović, Daša Donša, Peter Dolničar, Uroš Žibrat
Field-scale UAV-based multispectral phenomics: Leveraging machine learning, explainable AI, and hybrid feature engineering for enhancements in potato phenotyping
Computers and Electronics in Agriculture, volume 229, Article 109746. Publication date: 2025-02-01. Publication type: Journal Article.
DOI: 10.1016/j.compag.2024.109746
URL: https://www.sciencedirect.com/science/article/pii/S0168169924011372
Impact factor: 7.7. JCR: Q1 (Agriculture, Multidisciplinary).
Abstract
Fast and accurate identification of potato plant traits is essential for formulating effective cultivation strategies. The integration of spectral cameras on Unmanned Aerial Vehicles (UAVs) has demonstrated appealing potential, facilitating non-invasive investigations on a large scale by providing valuable features for construction of machine learning models. Nevertheless, interpreting these features, and those derived from them, remains a challenge, limiting confident utilization in real-world applications. In this study, the interpretability of machine learning models is addressed by employing SHAP (SHapley Additive exPlanations) and UMAP (Uniform Manifold Approximation and Projection) to better understand the modeling process. The XGBoost model was trained on a multispectral dataset of potato plants and evaluated on various tasks, namely variety classification, estimation of physiological measures, and detection of early blight disease. To optimize its performance, nearly 100 vegetation indices and over 500 auto-generated features were utilized for training. The results indicate successful separation of plant varieties with up to 97.10% accuracy, estimation of physiological values with a maximum R² and rNRMSE of 0.57 and 0.129, respectively, and detection of early blight with an F1 score of 0.826. Furthermore, both UMAP and SHAP proved beneficial for comprehensive analysis. UMAP visual observations closely corresponded to computed metrics, enhancing confidence for variety differentiation. Concurrently, SHAP identified the most informative features (green, red edge, and NIR channels) for most tasks, aligning tightly with existing literature. This study highlights potential improvements in farming efficiency, crop yield, and sustainability, and promotes the development of interpretable machine learning models for remote sensing applications.
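As a rough illustration of what a vegetation index is, the sketch below computes three widely used normalized-difference indices (NDVI, GNDVI, NDRE) from per-plot band reflectances. The abstract does not list which of the nearly 100 indices were used, so the choice of these three, the function names, and the sample reflectance values are all assumptions made for illustration only.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b); fall back to 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def vegetation_indices(green, red, red_edge, nir):
    """Compute three standard indices from band reflectances (hypothetical helper).

    The index definitions are the standard ones; whether the study used
    these particular indices is an assumption.
    """
    return {
        "NDVI": normalized_difference(nir, red),        # NIR vs. red
        "GNDVI": normalized_difference(nir, green),     # NIR vs. green
        "NDRE": normalized_difference(nir, red_edge),   # NIR vs. red edge
    }


# Made-up reflectance values for a single plot, not data from the study.
example = vegetation_indices(green=0.08, red=0.05, red_edge=0.20, nir=0.45)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Indices like these would then serve as input columns for a model such as XGBoost, alongside the raw band values.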
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.