SHAP-NET, a network based on Shapley values as a new tool to improve the explainability of the XGBoost-SHAP model for the problem of water quality
Author: Marek Kruk
Journal: Environmental Modelling & Software, volume 188, Article 106403
Publication date: 2025-03-01
Publication type: Journal Article
DOI: 10.1016/j.envsoft.2025.106403
URL: https://www.sciencedirect.com/science/article/pii/S1364815225000878
Impact factor: 4.8; JCR: Q1 (Computer Science, Interdisciplinary Applications)
Citations: 0
Abstract
This work aims to combine boosting-based modelling and Shapley value computation with the practice of evaluating an undirected graph model. To this end, we built an XGBoost-SHAP regression model in which the target variable is the cyanobacteria concentration and the predictors are 20 environmental factors. Two partial correlation-based graphs were then constructed: first, a preliminary network built from the original data and containing all the features together with the target variable; and second, a network called SHAP-NET, built from the Shapley values of the independent variables in the SHAP model. Combined machine learning and network tools such as SHAP-NET appear able to further improve model explainability in the field of XAI (eXplainable Artificial Intelligence), and attempts to solve practical domain problems, as in this work, can contribute to progress in this area.
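The idea of a partial correlation-based graph over per-feature values can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say how the partial correlations were estimated or which edges were kept. The sketch assumes each column holds one feature's attribution values (in the paper these would be Shapley values from the XGBoost-SHAP model; here they are synthetic stand-ins), computes partial correlations from the inverse of the correlation matrix, and keeps an edge when the absolute partial correlation reaches a threshold. The names `x1`, `x2`, `x3`, the helper functions, and the threshold of 0.2 are all hypothetical.

```python
import math
import random


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    p = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small dense matrices)."""
    p = len(matrix)
    aug = [
        list(row) + [float(i == j) for j in range(p)]
        for i, row in enumerate(matrix)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def partial_correlations(columns):
    """Partial correlation of every pair, conditioning on all other columns."""
    prec = invert(correlation_matrix(columns))
    p = len(prec)
    return [
        [
            1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def network_edges(names, columns, threshold=0.2):
    """Undirected edges whose absolute partial correlation meets the threshold."""
    pc = partial_correlations(columns)
    return [
        (names[i], names[j], pc[i][j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
        if abs(pc[i][j]) >= threshold
    ]


# Synthetic stand-in for attribution values: a chain x1 -> x2 -> x3.
random.seed(0)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + random.gauss(0, 1) for a in x1]
x3 = [b + random.gauss(0, 1) for b in x2]

names = ["x1", "x2", "x3"]
columns = [x1, x2, x3]
edges = network_edges(names, columns)
for source, target, weight in edges:
    print(f"{source} -- {target}: {weight:.2f}")
```

In this toy chain, `x1` and `x3` are clearly correlated, yet their partial correlation given `x2` is close to zero, so no direct edge is drawn between them. This is why a partial correlation graph can be sparser and easier to read than a plain correlation graph.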
About the journal:
Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.