Combining machine learning algorithms for bridging gaps in GRACE and GRACE Follow-On missions using ERA5-Land reanalysis
Jaydeo K. Dharpure, Ian M. Howat, Saurabh Kaushik, Bryan G. Mark
Science of Remote Sensing, Volume 11, Article 100198. Published 2025-01-25. DOI: 10.1016/j.srs.2025.100198
https://www.sciencedirect.com/science/article/pii/S2666017225000045
Abstract
The Gravity Recovery and Climate Experiment (GRACE) and GRACE Follow-On (GFO) missions have provided valuable data for monitoring global terrestrial water storage anomalies (TWSA) over the past two decades. However, the nearly one-year gap between these missions poses challenges for long-term TWSA measurements and various applications. Unlike previous studies, we combine five Machine Learning (ML) methods, namely Random Forest (RF), Support Vector Machine (SVM), eXtreme Gradient Boosting (XGB), Deep Neural Network (DNN), and Stacked Long Short-Term Memory (SLSTM), and bridge the gap between GRACE and GFO by using the best-performing ML model to estimate TWSA at each grid cell. The models were trained using six hydroclimatic variables (temperature, precipitation, runoff, evapotranspiration, ERA5-Land-derived TWSA, and cumulative water storage change), as well as a vegetation index and timing variables, to reconstruct global land TWSA at 0.5° grid resolution. We evaluated the performance of each model using Nash-Sutcliffe Efficiency (NSE), Pearson's Correlation Coefficient (PCC), and Root Mean Square Error (RMSE). On the test data, the area-weighted average NSE, PCC, and RMSE are 0.51 ± 0.31, 0.71 ± 0.23, and 4.75 ± 3.63 cm, respectively. The model's performance was further evaluated across five climatic zones, against two previously reconstructed products (the Li and Humphrey methods) at 26 major river basins, during flood/drought events, and for sea-level rise. Our results showcase the model's superior performance and its capability to accurately predict data gaps at both grid and basin scales globally.
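The evaluation and per-grid-cell model selection described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes predictions are stored as arrays of shape (time, lat, lon), that "best-performing" means the highest NSE at each cell, and that area weighting uses the cosine of latitude; the abstract specifies none of these details, and all function names are hypothetical.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency per grid cell, computed along the time axis (axis 0)."""
    num = np.sum((obs - sim) ** 2, axis=0)
    den = np.sum((obs - obs.mean(axis=0)) ** 2, axis=0)
    return 1.0 - num / den

def pcc(obs, sim):
    """Pearson's Correlation Coefficient per grid cell along the time axis."""
    o = obs - obs.mean(axis=0)
    s = sim - sim.mean(axis=0)
    return np.sum(o * s, axis=0) / np.sqrt(np.sum(o ** 2, axis=0) * np.sum(s ** 2, axis=0))

def rmse(obs, sim):
    """Root Mean Square Error per grid cell along the time axis."""
    return np.sqrt(np.mean((obs - sim) ** 2, axis=0))

def select_best_model(obs, preds):
    """Pick, at each grid cell, the model with the highest NSE (assumed criterion).

    obs:   array (time, lat, lon) of reference TWSA
    preds: dict mapping model name -> array (time, lat, lon)
    Returns (names, best_idx, best_pred), where best_idx[lat, lon] indexes into names.
    """
    names = list(preds)
    scores = np.stack([nse(obs, preds[n]) for n in names])   # (model, lat, lon)
    best_idx = np.argmax(scores, axis=0)                      # (lat, lon)
    stacked = np.stack([preds[n] for n in names])             # (model, time, lat, lon)
    # Gather the winning model's time series at every cell.
    best_pred = np.take_along_axis(stacked, best_idx[None, None, :, :], axis=0)[0]
    return names, best_idx, best_pred

def area_weighted_mean(field, lat_deg):
    """Mean of a (lat, lon) field weighted by cos(latitude) (assumed weighting)."""
    weights = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(field)
    return np.sum(weights * field) / np.sum(weights)
```

In practice the selection would be made on a held-out test period, and a call such as `area_weighted_mean(nse(obs, best_pred), lat_deg)` would then yield a single global score comparable in form to the area-weighted averages reported above.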