Predicting Daily River Chlorophyll Concentrations at a Continental Scale
Authors: Philip Savoy, Judson W. Harvey
Journal: Water Resources Research, volume 58, issue 12
DOI: 10.1029/2022wr034215 (https://doi.org/10.1029/2022wr034215)
Publication date: 2023-11-09
Abstract
Eutrophication is one of the largest threats to aquatic ecosystems, and chlorophyll a measurements are relevant indicators of trophic state and algal abundance. Many studies have modeled chlorophyll a in rivers, but model development and testing have largely occurred at individual sites, which hampers the creation of generalized models capable of making broad-scale predictions. To address this gap, we compiled a large data set of chlorophyll a concentrations matched to other water quality, meteorological, and reach characteristic data for a diverse set of 82 streams and rivers across the United States. We used this data set and extreme gradient boosting, a tree-based machine learning algorithm, to predict daily chlorophyll a concentrations. Furthermore, we tested several practical considerations of broad-scale models, such as making predictions at sites not included in model training and the utility of in situ water quality data versus universally available, remotely estimated model inputs. Predictions were very strongly correlated with observations when compared against a randomly withheld subset of days; however, the model had lower accuracy when applied to completely novel sites withheld from model training. Turbidity and total nitrogen were the two most important variables for predicting chlorophyll a. Although in situ variables improved modeled estimates and were identified as more important during model interpretation, using only remote inputs still resulted in highly correlated predictions with small bias. Testing a model across many sites allowed for identification of common variables relevant to chlorophyll a and highlighted several challenges for applying data-driven models to new sites or at larger spatial scales.
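The abstract contrasts two ways of evaluating the model: against a randomly withheld subset of days, and against sites withheld entirely from training. The sketch below illustrates only that difference in how the data are split. It is not the authors' code, and it fits no model. The toy records, the 20% test fraction, and the function names are assumptions made for illustration. Only the site count of 82 comes from the abstract.

```python
import random


def random_day_split(records, test_frac=0.2, seed=0):
    """Withhold a random subset of site-days; test sites also occur in training."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_frac))
    return shuffled[n_test:], shuffled[:n_test]


def site_holdout_split(records, test_frac=0.2, seed=0):
    """Withhold whole sites, so no test site is ever seen during training."""
    rng = random.Random(seed)
    sites = sorted({r["site"] for r in records})
    rng.shuffle(sites)
    n_test = max(1, int(round(len(sites) * test_frac)))
    test_sites = set(sites[:n_test])
    train = [r for r in records if r["site"] not in test_sites]
    test = [r for r in records if r["site"] in test_sites]
    return train, test


def shared_sites(train, test):
    """Sites that appear on both sides of a split."""
    return {r["site"] for r in train} & {r["site"] for r in test}


# Hypothetical toy data: 82 sites with 30 daily records each (no real measurements).
records = [{"site": f"site_{s:02d}", "day": d} for s in range(82) for d in range(30)]

train_r, test_r = random_day_split(records)
train_s, test_s = site_holdout_split(records)

print("random-day split, sites shared by train and test:", len(shared_sites(train_r, test_r)))
print("site-holdout split, sites shared by train and test:", len(shared_sites(train_s, test_s)))
```

Under the random-day split, the model has already seen other days from every test site, which is one plausible reason accuracy measured that way can look better than accuracy at completely novel sites.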
About the journal:
Water Resources Research (WRR) is an interdisciplinary journal that focuses on hydrology and water resources. It publishes original research in the natural and social sciences of water. It emphasizes the role of water in the Earth system, including physical, chemical, biological, and ecological processes in water resources research and management, including social, policy, and public health implications. It encompasses observational, experimental, theoretical, analytical, numerical, and data-driven approaches that advance the science of water and its management. Submissions are evaluated for their novelty, accuracy, significance, and broader implications of the findings.