Prediction and Interpretation of Total N and Its Key Drivers in Cultivated Tropical Peat using Machine Learning and Game Theory

Authors: Heru Bagus Pulunggono, Yusuf Azmi Madani Madani, Lina Lathifah Nurazizah, Moh Zulfajrin
Journal: CELEBES Agricultural
Published: 2023-12-31
DOI: 10.52045/jca.v4i1.592 (https://doi.org/10.52045/jca.v4i1.592)
Abstract
Research communities are increasingly interested in developing statistical learning-based pedotransfer functions (PtFs) to predict nutrients in mineral soils; however, similar studies in peatlands are relatively rare. Moreover, extracting meaningful information from these ‘black-box’ models is crucial, given their algorithmic complexity and the non-linear interrelationships among soil covariates. This study used the Pulunggono (2022a) dataset and the bootstrapping method to (1) develop and evaluate seven PtF models, including both general linear model (GLM) and machine learning (ML) regressors, for estimating total nitrogen (N) in tropical peat drained and cultivated for oil palm (OP) in Riau, Indonesia, and (2) explain how the models function using Shapley Additive Explanation (SHAP), a tool derived from coalitional game theory. ML-based PtFs outperformed GLM algorithms in estimating total N. The top-performing algorithms were GBM, XGB, and Cubist. SHAP consistently identified sampling depth and organic C as the most important covariates across all models, irrespective of their algorithmic capabilities. Additionally, the ML algorithms identified total Fe, pH, and bulk density (BD) as significant covariates. Local explanations based on Shapley values indicated that the behavior of the PtF algorithms diverged from their global explanations. This study emphasizes the critical role of ML algorithms and game theory in accurately predicting total N in peatlands drained and cultivated for OP and in explaining model behavior in relation to soil biogeochemical processes.
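The contrast between local and global Shapley-based explanations can be illustrated with a minimal sketch. It is not the study's code: the `toy_model` function, its coefficients, and the four background rows are hypothetical placeholders invented for illustration, and the exact enumeration of coalitions shown here is only practical for a handful of covariates (SHAP implementations typically rely on approximations). Each local explanation attributes one prediction to its covariates; averaging the absolute local values gives a global ranking.

```python
from itertools import combinations
from math import factorial


def toy_model(row):
    """Hypothetical stand-in for a fitted PtF (NOT the study's model)."""
    depth, organic_c, ph = row
    # Made-up coefficients; the last term adds a simple non-linear interaction.
    return 2.0 - 0.01 * depth + 0.02 * organic_c + 0.1 * ph * (organic_c > 50)


def coalition_value(model, x, background, coalition):
    """Mean prediction when covariates in `coalition` are fixed to x's values
    and the remaining covariates are taken from each background row."""
    total = 0.0
    for b in background:
        mixed = [x[j] if j in coalition else b[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values for one sample by enumerating all coalitions."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Hypothetical rows: (sampling depth, organic C, pH). Invented values.
background = [(10, 40, 3.5), (30, 52, 3.8), (60, 55, 4.0), (90, 48, 3.6)]

# Local explanations: one vector of Shapley values per sample.
local = [shapley_values(toy_model, row, background) for row in background]

# Global importance: mean absolute Shapley value per covariate.
global_importance = [
    sum(abs(phi[i]) for phi in local) / len(local) for i in range(3)
]

# Average prediction over the background data (the explanation baseline).
baseline = sum(toy_model(b) for b in background) / len(background)
```

For every sample, the local values sum to the difference between that sample's prediction and `baseline`, which is the additivity property that gives SHAP its name. A covariate can rank low in `global_importance` yet dominate the explanation of an individual sample, which is one way local explanations can diverge from global ones.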