Pouya Hosseinzadeh, A. Nassar, S. F. Boubrahimi, S. M. Hamdi
ML-Based Streamflow Prediction in the Upper Colorado River Basin Using Climate Variables Time Series Data
Hydrology (Journal Article), published 2023-01-19. Impact factor: 3.1. JCR: Q2 (Water Resources).
DOI: 10.3390/hydrology10020029 (https://doi.org/10.3390/hydrology10020029)
Citations: 5
Abstract
Streamflow prediction plays a vital role in water resources planning and in understanding the dramatic changes in climatic and hydrologic variables over different time scales. In this study, we used machine learning (ML)-based prediction models, including Random Forest Regression (RFR), Long Short-Term Memory (LSTM), Seasonal Autoregressive Integrated Moving Average (SARIMA), and Facebook Prophet (PROPHET), to predict natural streamflow 24 months ahead at the Lees Ferry site, located at the lower end of the Upper Colorado River Basin (UCRB) in the US. First, we used only historical streamflow data to predict 24 months ahead. Second, we added meteorological components such as temperature and precipitation as additional features. We tested the models on a monthly test dataset spanning 6 years, where 24-month predictions were repeated 50 times to ensure the consistency of the results. Moreover, we performed a sensitivity analysis to identify our best-performing model. We then analyzed how different span window sizes affect the quality of the predictions made by our best model. Finally, we applied our best-performing model, RFR, to two more rivers in different states in the UCRB to test the model's generalizability. We evaluated the performance of the predictive models using multiple evaluation measures. The predictions of the multivariate time-series models were found to be more accurate, with RMSE less than 0.84 mm per month, R-squared more than 0.8, and MAPE less than 0.25. Therefore, we conclude that including the temperature and precipitation of the UCRB increases the accuracy of the predictions. Ultimately, we found that multivariate RFR performs best among the four models and is generalizable to other rivers in the UCRB.
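The windowing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`make_windows`, `rmse`, `r_squared`, `mape`), the window layout, and the exact metric definitions are assumptions, since the abstract does not specify them. MAPE is computed here as a fraction, which is one reading of the reported "less than 0.25".

```python
import math

def make_windows(series, window, horizon):
    """Pair each `window`-month history with the `horizon` months that follow it.

    Hypothetical helper: the abstract mentions span window sizes and a
    24-month horizon but not how the samples were actually constructed.
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        history = series[start:start + window]
        target = series[start + window:start + window + horizon]
        pairs.append((history, target))
    return pairs

def rmse(observed, predicted):
    """Root mean squared error, in the units of the series."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(observed, predicted):
    """Coefficient of determination (1 - SS_res / SS_tot)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def mape(observed, predicted):
    """Mean absolute percentage error as a fraction; observed values must be nonzero."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return sum(ratios) / len(ratios)

if __name__ == "__main__":
    # Synthetic, perfectly seasonal monthly series (made-up data, 5 years).
    flow = [float(month % 12 + 1) for month in range(60)]
    history, target = make_windows(flow, window=12, horizon=24)[0]
    # Seasonal-naive forecast: repeat the last 12 months over the horizon.
    forecast = [history[i % 12] for i in range(len(target))]
    print(rmse(target, forecast), r_squared(target, forecast), mape(target, forecast))
```

A multivariate variant would append lagged temperature and precipitation values to each `history` before fitting a regressor; the seasonal-naive forecast above only stands in for a trained model.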
Hydrology (Earth and Planetary Sciences: Earth-Surface Processes)
CiteScore: 4.90
Self-citation rate: 21.90%
Articles published: 192
Review time: 6 weeks
About the journal:
Journal of Hydrology publishes original research papers and comprehensive reviews in all the subfields of the hydrological sciences, including water-based management and policy issues that impact on economics and society. These comprise, but are not limited to, the physical, chemical, biogeochemical, stochastic and systems aspects of surface and groundwater hydrology, hydrometeorology, hydrogeology and hydrogeophysics. Relevant topics incorporating the insights and methodologies of disciplines such as climatology, water resource systems, ecohydrology, geomorphology, soil science, instrumentation and remote sensing, data and information sciences, and civil and environmental engineering are within scope. Social science perspectives on hydrological problems, such as resource and ecological economics, sociology, psychology and behavioural science, and management and policy analysis, are also invited. Multi- and interdisciplinary analyses of hydrological problems are within scope. The science published in the Journal of Hydrology is relevant to catchment scales rather than exclusively to a local scale or site. Studies focused on urban hydrological issues are included.