Drought index time series forecasting via three-in-one machine learning concept for the Euphrates basin
Levent Latifoğlu, Savaş Bayram, Gaye Aktürk, Hatice Citakoglu
Earth Science Informatics, published 2024-09-16. DOI: https://doi.org/10.1007/s12145-024-01471-8
Impact factor: 2.7 (JCR Q2, Computer Science, Interdisciplinary Applications)
Citations: 0
Abstract
Droughts are among the most hazardous and costly natural disasters and are hard to quantify and characterize. Accurate drought forecasting reduces the devastating economic effects of droughts on ecosystems and people. Eastern Anatolia is the largest and coldest geographical region of Türkiye. Previous studies have not addressed drought forecasting in the Eastern Anatolia (Upper Mesopotamia) Region, where agriculture is limited because the land is under snow for most of the year. This study focuses on the Euphrates basin, specifically the Tercan and Tunceli meteorological stations of the Karasu River sub-basin, a vital water resource of the Eastern Anatolia Region. In this context, time series of 1-, 3-, 6-, 9-, and 12-month Standardized Precipitation Index (SPI) and Standardized Precipitation Evapotranspiration Index (SPEI) values were created. The Tuned Q-factor Wavelet Transform (TQWT) method and Univariate Feature Ranking Using F-Tests (FSRFtest) were used for pre-processing and feature selection. Stand-alone, hybrid, and tribrid models were created. Machine Learning (ML) methods, namely Artificial Neural Networks (ANN), Gaussian Process Regression (GPR), and Support Vector Machine (SVM), were applied to the time series analyses. At the Tercan station, GPR performed better than ANN and SVM, giving the better result in 80% of cases. For the SPI output at the Tunceli station, SVM was superior in 60% of cases and performed comparably to GPR, while ANN was again inferior. Similarly, for the SPEI output at the Tunceli station, neither GPR nor ANN was clearly superior, because each was successful in 40% of cases. This study contributes by adding tribrid models as a third concept to the stand-alone and hybrid model comparison in drought forecasting. The hybrid and tribrid ML methods led to 91% and 64% decreases in relative root mean square error percentage compared with stand-alone ML methods for SPEI and SPI at the two stations. The hybrid model was more successful in 80% of cases at the Tercan station and in 90% of cases at the Tunceli station. Although hybrid models were superior, tribrid models performed close to them while also reducing computational load and shortening calculation time.
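The comparison above is reported as a percentage decrease in relative root mean square error. The abstract does not define how the error is normalized, so the following is only a minimal sketch: it assumes the RMSE is divided by the range of the observed series, and the function names and sample values are hypothetical, not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse_percent(observed, predicted):
    """RMSE as a percentage of the observed range.

    Assumption: normalization by (max - min) of the observations.
    The paper may use a different normalization.
    """
    spread = max(observed) - min(observed)
    if spread == 0:
        raise ValueError("observed series has zero range")
    return 100.0 * rmse(observed, predicted) / spread


def percent_decrease(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - improved) / baseline


# Hypothetical drought index values and two hypothetical forecasts.
observed = [-1.0, 0.0, 1.0, 2.0]
stand_alone = [-0.5, 0.5, 1.5, 2.5]  # every forecast off by 0.5
hybrid = [-0.9, 0.1, 1.1, 2.1]       # every forecast off by 0.1

base_err = relative_rmse_percent(observed, stand_alone)
hybrid_err = relative_rmse_percent(observed, hybrid)
print(f"stand-alone: {base_err:.2f}%  hybrid: {hybrid_err:.2f}%")
print(f"decrease: {percent_decrease(base_err, hybrid_err):.1f}%")
```

Normalizing by the range rather than the mean is a deliberate choice in this sketch, because standardized indices such as SPI and SPEI have a mean close to zero, which would make a mean-normalized error unstable.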
About the journal:
The Earth Science Informatics [ESIN] journal aims at rapid publication of high-quality, current, cutting-edge, and provocative scientific work in the area of Earth Science Informatics as it relates to Earth systems science and space science. This includes articles on the application of formal and computational methods, computational Earth science, spatial and temporal analyses, and all aspects of computer applications to the acquisition, storage, processing, interchange, and visualization of data and information about the materials, properties, processes, features, and phenomena that occur at all scales and locations in the Earth system’s five components (atmosphere, hydrosphere, geosphere, biosphere, cryosphere) and in space (see "About this journal" for more detail). The quarterly journal publishes research, methodology, and software articles, as well as editorials, comments, and book and software reviews. Review articles of relevant findings, topics, and methodologies are also considered.