Improving carbon flux estimation in tea plantation ecosystems: A machine learning ensemble approach
Authors: Ali Raza, Yongguang Hu, Yongzong Lu
Journal: European Journal of Agronomy, volume 160, Article 127297 (JCR Q1, Agronomy; impact factor 4.5)
Published: 2024-08-10
DOI: 10.1016/j.eja.2024.127297
URL: https://www.sciencedirect.com/science/article/pii/S1161030124002181
Citations: 0
Abstract
Tea (Camellia sinensis) is a major global crop and the most widely consumed beverage after water. Quantifying carbon flux, specifically the net ecosystem exchange (NEE), in tea plantations is essential for determining carbon sequestration and the ecosystem carbon balance. The eddy covariance (EC) system is widely used for continuous monitoring of carbon flux, but high installation and maintenance costs limit its adoption. In addition, EC flux data are often discarded because instruments malfunction under adverse weather conditions. Additional approaches for estimating NEE are therefore needed to overcome these challenges and ensure accurate NEE estimates. For this purpose, three standalone tree-based machine learning (ML) models were used to estimate NEE from EC flux data collected in a tea ecosystem in a subtropical region of China (Danyang county, Zhenjiang city). To address the accuracy limitations of standalone ML models, an ensemble mechanism based on the voting regressor method was proposed. In addition, k-fold cross-validation with early stopping was used to enhance the performance of the standalone ML models. Based on visual plots (scatter diagram, heatmap, Taylor diagram) and performance indices (root-mean-square error (RMSE), coefficient of determination (R²), mean absolute error (MAE), mean absolute percentage error (MAPE), Nash–Sutcliffe efficiency (NSE), correlation coefficient (r), Kling–Gupta efficiency (KGE) and index of agreement (d)), the non-linear ensemble generalized regression neural network (NLE-GRNN) significantly improved on the results of the standalone ML models. In the current study, the best standalone ML model (DT) achieved NSE, r and d values of 0.49, 0.73 and 0.75, respectively, while the proposed NLE-GRNN model raised NSE by 0.48 (to 0.97), r by 0.25 (to 0.98) and d by 0.24 (to 0.99), reported as gains of 48 %, 25 % and 24 %.
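As a rough illustration of the indices named above, the following Python sketch computes RMSE, MAE, NSE, r and d using their common textbook definitions. The abstract does not state which variant of each index the authors used (for example, d is written here in Willmott's squared-difference form), so these formulas are assumptions. The `observed` and `estimated` lists are made-up values for demonstration, not data from the study. The last lines recompute the reported gains from the scores quoted in the abstract, showing that they correspond to absolute differences multiplied by 100.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, sim):
    """Root-mean-square error between observations and estimates."""
    return math.sqrt(_mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def mae(obs, sim):
    """Mean absolute error between observations and estimates."""
    return _mean([abs(o - s) for o, s in zip(obs, sim)])


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - squared error / spread of observations."""
    m = _mean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - m) ** 2 for o in obs)
    return 1.0 - sse / sst


def pearson_r(obs, sim):
    """Pearson correlation coefficient."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    spread_o = math.sqrt(sum((o - mo) ** 2 for o in obs))
    spread_s = math.sqrt(sum((s - ms) ** 2 for s in sim))
    return cov / (spread_o * spread_s)


def index_of_agreement(obs, sim):
    """Willmott's index of agreement d (squared-difference form; an assumption)."""
    m = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - m) + abs(o - m)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - num / den


# Hypothetical NEE values for illustration only (not data from the study).
observed = [-8.0, -5.0, -2.0, 1.0, 3.0]
estimated = [-7.0, -5.5, -1.0, 0.5, 2.0]

scores = {
    "RMSE": rmse(observed, estimated),
    "MAE": mae(observed, estimated),
    "NSE": nse(observed, estimated),
    "r": pearson_r(observed, estimated),
    "d": index_of_agreement(observed, estimated),
}
print(scores)

# Scores quoted in the abstract for the best standalone model and NLE-GRNN.
standalone = {"NSE": 0.49, "r": 0.73, "d": 0.75}
ensemble = {"NSE": 0.97, "r": 0.98, "d": 0.99}
gains = {k: round((ensemble[k] - standalone[k]) * 100) for k in standalone}
print(gains)  # {'NSE': 48, 'r': 25, 'd': 24}
```

NSE and d both equal 1 for a perfect estimate, and NSE falls to 0 when the estimate is no better than the mean of the observations, which is why an NSE of 0.49 for a standalone model indicates only moderate skill.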
NLE-GRNN also significantly reduced the errors (MAE, MAPE and RMSE) and provided the NEE estimates closest to the observed values. SHapley Additive exPlanations (SHAP) analysis of the impact of climatic variables on NEE revealed that Rg (solar radiation) and Tair (air temperature) were the prime factors controlling NEE variation in the tea ecosystem. Considering the high accuracy and stability of the studied ML models, the developed ensemble ML model (NLE-GRNN) is recommended for improving NEE estimates in tea biomes and other ecosystems.
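To make the idea behind the SHAP analysis concrete, the sketch below computes exact Shapley values for a toy model by averaging each feature's marginal contribution over all feature orderings. This is not the authors' pipeline and does not use the SHAP library, which estimates these values against a background dataset rather than a single baseline point. The function `toy_nee`, its coefficients, and the `baseline` and `instance` values are hypothetical and not fitted to any data; only the feature names Rg and Tair come from the abstract.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance`, relative to `baseline`.

    `model` takes a dict mapping feature name to value. Features are switched
    from their baseline value to their instance value one at a time, in every
    possible order, and each feature's marginal contributions are averaged.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: totals[name] / len(orderings) for name in names}


def toy_nee(x):
    # Hypothetical response with arbitrary coefficients, for illustration only.
    return -0.02 * x["Rg"] + 0.10 * x["Tair"] - 0.0004 * x["Rg"] * x["Tair"]


baseline = {"Rg": 0.0, "Tair": 15.0}    # assumed reference conditions
instance = {"Rg": 500.0, "Tair": 25.0}  # assumed conditions to explain

phi = shapley_values(toy_nee, instance, baseline)
print(phi)
```

The values in `phi` sum to the difference between the model output at `instance` and at `baseline`, which is the property that lets such attributions be read as each variable's share of the predicted NEE change.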
About the journal:
The European Journal of Agronomy, the official journal of the European Society for Agronomy, publishes original research papers reporting experimental and theoretical contributions to field-based agronomy and crop science. The journal will consider research at the field level for agricultural, horticultural and tree crops, that uses comprehensive and explanatory approaches. The EJA covers the following topics:
crop physiology
crop production and management including irrigation, fertilization and soil management
agroclimatology and modelling
plant-soil relationships
crop quality and post-harvest physiology
farming and cropping systems
agroecosystems and the environment
crop-weed interactions and management
organic farming
horticultural crops
papers from the European Society for Agronomy bi-annual meetings
In determining the suitability of submitted articles for publication, particular scrutiny is placed on the degree of novelty and significance of the research and the extent to which it adds to existing knowledge in agronomy.