Predicting Daily River Chlorophyll Concentrations at a Continental Scale

Impact factor: 4.6; Zone 1 (Earth Sciences); JCR Q2 (Environmental Sciences); Water Resources Research; Pub Date: 2023-11-09; DOI: 10.1029/2022wr034215

Philip Savoy, Judson W. Harvey



Abstract

Eutrophication is one of the largest threats to aquatic ecosystems, and chlorophyll a measurements are relevant indicators of trophic state and algal abundance. Many studies have modeled chlorophyll a in rivers, but model development and testing have largely occurred at individual sites, which hampers the creation of generalized models capable of making broad-scale predictions. To address this gap, we compiled a large data set of chlorophyll a concentrations matched to other water quality, meteorological, and reach characteristic data for a diverse set of 82 streams and rivers across the United States. We used this data set and extreme gradient boosting, a tree-based machine learning algorithm, to predict daily chlorophyll a concentrations. Furthermore, we tested several practical considerations of broad-scale models, such as making predictions at sites not included in model training and the utility of in situ water quality data versus universally available, remotely estimated model inputs. Predictions were very strongly correlated with observations when compared against a randomly withheld subset of days; however, the model had lower accuracy when applied to completely novel sites withheld from model training. Turbidity and total nitrogen were the two most important variables for predicting chlorophyll a. Although in situ variables improved modeled estimates and were identified as more important during model interpretation, using only remote inputs still resulted in highly correlated predictions with small bias. Testing a model across many sites allowed for identification of common variables relevant to chlorophyll a and highlighted several challenges for applying data-driven models to new sites or at larger spatial scales.
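The two evaluation settings described in the abstract, a randomly withheld subset of days versus entire sites withheld from training, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record layout, the function names, the 20% test fraction, and the synthetic values are hypothetical and are not taken from the paper. The correlation and bias helpers are generic textbook definitions, not necessarily the metrics the authors computed, and no model is fitted.

```python
import random
from statistics import mean


def random_day_split(records, test_fraction=0.2, seed=0):
    """Withhold a random subset of site-days; test sites also occur in training."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def site_holdout_split(records, test_fraction=0.2, seed=0):
    """Withhold whole sites so that no test site is ever seen in training."""
    rng = random.Random(seed)
    sites = sorted({r["site"] for r in records})
    rng.shuffle(sites)
    n_test = max(1, int(len(sites) * test_fraction))
    test_sites = set(sites[:n_test])
    train = [r for r in records if r["site"] not in test_sites]
    test = [r for r in records if r["site"] in test_sites]
    return train, test


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / (var_o * var_p) ** 0.5


def mean_bias(obs, pred):
    """Mean of (predicted - observed); positive means overprediction."""
    return mean(p - o for o, p in zip(obs, pred))


# Synthetic placeholder data (hypothetical): 10 sites x 30 days.
records = [
    {"site": f"site_{s:02d}", "day": d, "chl_a": float(s + d % 7)}
    for s in range(10)
    for d in range(30)
]

train_days, test_days = random_day_split(records)
train_sites, test_sites = site_holdout_split(records)
```

With the random-day split, every test site usually also contributes training days, so the model has already seen that site's conditions. The site-holdout split removes that overlap, which is the harder setting in which the abstract reports lower accuracy.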




Source journal: Water Resources Research (Environmental Sciences, Limnology)

CiteScore: 8.80

Self-citation rate: 13.00%

Articles published: 599

Review time: 3.5 months

About the journal: Water Resources Research (WRR) is an interdisciplinary journal that focuses on hydrology and water resources. It publishes original research in the natural and social sciences of water. It emphasizes the role of water in the Earth system, covering physical, chemical, biological, and ecological processes in water resources research and management, as well as social, policy, and public health implications. It encompasses observational, experimental, theoretical, analytical, numerical, and data-driven approaches that advance the science of water and its management. Submissions are evaluated for their novelty, accuracy, significance, and broader implications of the findings.