P. Goodling, K. Belitz, P. Stackelberg, B. Fleming
Environmental Modelling & Software. Published 2024-06-21. DOI: 10.1016/j.envsoft.2024.106124
A spatial machine learning model developed from noisy data requires multiscale performance evaluation: Predicting depth to bedrock in the Delaware river basin, USA
Spatial machine learning models can be developed from observations with substantial unexplainable variability, sometimes called ‘noise’. Traditional point-scale metrics (e.g., R²) alone can be misleading when evaluating these models. We present a multi-scale performance evaluation (MPE) using two additional scales (distributional and geostatistical). We apply the MPE framework to predictions of depth to bedrock (DTB) in the Delaware River Basin. Geostatistical analysis shows that approximately one third of the DTB variance occurs at spatial scales smaller than 2 km. Hence, we interpret our point-scale R² of 0.3 (testing data) to be sufficient for regional-scale modelling. Bias-correction methods improve performance at two of the three MPE scales: point-scale change is negligible, while distributional and geostatistical performance improves. In contrast, bias correction applied to a global DTB model does not improve MPE performance. This work encourages scale-appropriate performance evaluations to enable effective model intercomparison.
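The three evaluation scales named in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics the authors use at the distributional and geostatistical scales, so the quantile gap and the binned empirical semivariogram below are assumptions chosen for illustration, as are all function names, the bin edges, and the synthetic data. The demo builds a smooth "regional" signal plus noise carrying about one third of the total variance, then scores a hypothetical model that recovers the signal exactly.

```python
import numpy as np


def r_squared(obs, pred):
    """Point scale: coefficient of determination of predictions vs. observations."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def quantile_gap(obs, pred, qs=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Distributional scale (assumed metric): mean absolute gap between matching quantiles."""
    return float(np.mean(np.abs(np.quantile(obs, qs) - np.quantile(pred, qs))))


def empirical_semivariogram(xy, values, bin_edges):
    """Geostatistical scale: mean of 0.5 * (z_i - z_j)**2 over point pairs per distance bin."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)        # every unordered pair once
    dist = np.sqrt(np.sum((xy[i] - xy[j]) ** 2, axis=1))
    half_sq = 0.5 * (values[i] - values[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1         # bin index; out-of-range pairs are skipped
    gamma = np.full(len(bin_edges) - 1, np.nan)      # NaN marks bins with no pairs
    for b in range(len(gamma)):
        in_bin = which == b
        if in_bin.any():
            gamma[b] = half_sq[in_bin].mean()
    return gamma


if __name__ == "__main__":
    # Synthetic data only; none of these numbers come from the paper.
    rng = np.random.default_rng(0)
    xy = rng.uniform(0.0, 100.0, size=(500, 2))      # hypothetical locations, km
    signal = np.sin(xy[:, 0] / 10.0) + np.cos(xy[:, 1] / 10.0)
    noise = rng.normal(0.0, np.sqrt(0.5 * signal.var()), size=500)  # ~1/3 of total variance
    obs = signal + noise
    pred = signal                                     # model that recovers the signal exactly

    edges = np.array([0.0, 2.0, 5.0, 10.0, 20.0, 40.0])
    print("point-scale R2:", round(r_squared(obs, pred), 2))
    print("quantile gap:", round(quantile_gap(obs, pred), 2))
    print("semivariogram (obs): ", np.round(empirical_semivariogram(xy, obs, edges), 2))
    print("semivariogram (pred):", np.round(empirical_semivariogram(xy, pred, edges), 2))
```

In this synthetic setting the point-scale R² stays well below 1, at roughly the signal's share of the variance, even though the predictions match the regional pattern exactly, and the predicted semivariogram lacks the short-range variance present in the observations. That is the kind of discrepancy a single point-scale metric cannot distinguish from a poor model.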
About the journal:
Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.