A novel framework for developing accurate and explainable leaf nitrogen content estimation models for Aquilaria sinensis seedlings using canopy RGB imagery
Authors: Zhulin Chen, Xuefeng Wang
Journal: Biosystems Engineering, Volume 251, Pages 128-144
Publication date: 2025-02-17
DOI: 10.1016/j.biosystemseng.2025.02.003
URL: https://www.sciencedirect.com/science/article/pii/S1537511025000297
Abstract
Leaf nitrogen content (LNC) is crucial for the cultivation and health management of the endangered tree species Aquilaria sinensis. Although RGB imagery combined with machine learning has been effective for non-destructive LNC estimation, current models often neglect texture features computed from colour indices and face challenges in feature selection and interpretability. This study introduces a framework to address these issues. First, colour indices and texture features from canopy RGB imagery of Aquilaria sinensis seedlings were collected as an initial feature set. Then, an improved hybrid feature selection algorithm, which combines SHapley Additive exPlanation (SHAP) with a dynamic ranking strategy, was applied together with a regression algorithm. This approach was tested using random forest (RF), support vector regression (SVR), and deep neural network (DNN) models. Optimal feature subsets were identified for each model, and performance comparisons determined the best LNC estimation model. Results show that texture features derived from colour indices significantly enhance LNC estimation accuracy. The dynamic SHAP ranking method outperformed RF and fixed SHAP rankings in feature selection. The optimal model, a DNN with an R² of 0.946 and an RMSE of 1.859 g kg⁻¹, included two colour indices and five colour index texture features. While the normalised red colour index had the highest individual contribution, texture features contributed more overall to model accuracy. This method can be extended to the estimation of other biophysical and biochemical parameters.
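To make the idea of a "colour index texture feature" concrete, the following is a minimal sketch and not the authors' implementation. It assumes the normalised red index is r = R / (R + G + B) and uses grey-level co-occurrence matrix (GLCM) contrast as the texture measure. The abstract does not state which texture descriptors were used, so the GLCM choice, the function names, the 8-level quantisation, and the one-pixel horizontal offset are all hypothetical.

```python
from collections import Counter


def normalised_red(image):
    """Per-pixel normalised red index r = R / (R + G + B).

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    Black pixels (R + G + B == 0) are mapped to 0.0 to avoid division by zero.
    """
    return [[(r / (r + g + b)) if (r + g + b) else 0.0 for r, g, b in row]
            for row in image]


def quantise(index_map, levels=8, lo=0.0, hi=1.0):
    """Map index values in [lo, hi] to integer grey levels 0 .. levels - 1.

    The number of levels is an assumption made for this illustration.
    """
    scale = levels / (hi - lo)
    return [[min(levels - 1, max(0, int((v - lo) * scale))) for v in row]
            for row in index_map]


def glcm_contrast(grey, dx=1, dy=0):
    """GLCM contrast for the pixel offset (dx, dy).

    Counts co-occurring grey-level pairs (i, j), normalises the counts to
    probabilities P(i, j), and returns sum of P(i, j) * (i - j) ** 2.
    """
    pairs = Counter()
    height, width = len(grey), len(grey[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                pairs[(grey[y][x], grey[ny][nx])] += 1
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    return sum(n * (i - j) ** 2 for (i, j), n in pairs.items()) / total


# Two synthetic 4 x 4 "canopy" patches: one flat, one with alternating columns.
uniform = [[(100, 100, 50)] * 4 for _ in range(4)]
striped = [[(200, 50, 50) if x % 2 == 0 else (50, 200, 50) for x in range(4)]
           for _ in range(4)]

for name, patch in (("uniform", uniform), ("striped", striped)):
    index_map = normalised_red(patch)
    print(name, glcm_contrast(quantise(index_map)))
```

The flat patch yields zero contrast while the striped patch yields a large value. A mean colour index alone would not capture the spatial variation within a patch, which is the kind of information a texture feature adds.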
About the journal
Biosystems Engineering publishes research in engineering and the physical sciences that represents advances in understanding or modelling of the performance of biological systems for sustainable developments in land use and the environment, agriculture and amenity, bioproduction processes and the food chain. The subject matter of the journal reflects the wide range and interdisciplinary nature of research in engineering for biological systems.