Strictly Enforcing Invertibility and Conservation in CNN-Based Super Resolution for Scientific Datasets
Pub Date : 2023-01-01  DOI: 10.1175/aies-d-21-0012.1
A. Geiss, Joseph C. Hardin

Recently, deep convolutional neural networks (CNNs) have revolutionized image "super resolution" (SR), dramatically outperforming past methods for enhancing image resolution. They could be a boon for the many scientific fields that involve imaging or regularly gridded datasets: satellite remote sensing, radar meteorology, medical imaging, numerical modeling, and so on. Unfortunately, while SR-CNNs produce visually compelling results, they do not necessarily conserve physical quantities between their low-resolution inputs and high-resolution outputs when applied to scientific datasets. Here, a method for "downsampling enforcement" in SR-CNNs is proposed. A differentiable operator is derived that, when applied as the final transfer function of a CNN, ensures that the high-resolution outputs exactly reproduce the low-resolution inputs under 2D-average downsampling while improving performance of the SR schemes. The method is demonstrated with seven modern CNN-based SR schemes on several benchmark image datasets, and applications to weather radar, satellite imager, and climate model data are shown. The approach improves training time and performance while ensuring physical consistency between the super-resolved and low-resolution data.

Recent advances in using deep learning to increase the resolution of images have substantial potential across the many scientific fields that use images and image-like data. Most image super-resolution research has focused on the visual quality of outputs, however, and is not necessarily well suited to scientific data, where known physical constraints may need to be enforced. Here, we introduce a method to modify existing deep neural network architectures so that they strictly conserve physical quantities in the input field when "super resolving" scientific data, and we find that the method can improve performance across a wide range of datasets and neural networks. Integrating known physics and adherence to established physical constraints into deep neural networks will be a critical step before their potential can be fully realized in the physical sciences.
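The paper derives its own differentiable operator; as a rough illustration of the underlying idea only, a simple (and different) way to hard-enforce consistency under 2D-average downsampling is an additive correction applied to the raw network output. A minimal sketch, assuming NumPy arrays whose side lengths are divisible by the scale factor s:

```python
import numpy as np

def avg_downsample(hr, s):
    """2D-average downsampling by factor s (block mean)."""
    h, w = hr.shape
    return hr.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def enforce_downsampling(hr_raw, lr, s):
    """Additive correction layer: add the upsampled residual between the
    low-res input and the downsampled raw output, so the corrected output
    reproduces lr exactly under block-average downsampling.
    (Illustrative; not the paper's operator.)"""
    residual = lr - avg_downsample(hr_raw, s)           # shape (H/s, W/s)
    return hr_raw + np.kron(residual, np.ones((s, s)))  # replicate to HR grid
```

Because block-averaging the replicated residual returns the residual itself, the corrected output downsamples back to the low-resolution input exactly, whatever the network produced.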
Emulating the adaptation of wind fields to complex terrain with deep-learning
Pub Date : 2022-12-27  DOI: 10.1175/aies-d-22-0034.1
L. Le Toumelin, I. Gouttevin, N. Helbig, C. Galiez, Mathis Roux, F. Karbou

Estimating the impact of wind-driven snow transport requires modeling wind fields with a lower grid spacing than the spacing on the order of one or a few kilometers used in current numerical weather prediction (NWP) systems. In this context, we introduce a new strategy to downscale wind fields from NWP systems to decametric scales, using high-resolution (30-m) topographic information. Our method (named DEVINE) leverages a convolutional neural network (CNN), trained to replicate the behavior of the complex atmospheric model ARPS, previously run on a large number (7279) of synthetic Gaussian topographies under controlled weather conditions. A 10-fold cross-validation reveals that our CNN is able to accurately emulate the behavior of ARPS (mean absolute error for wind speed = 0.16 m/s). We then apply DEVINE to real cases in the Alps, i.e., downscaling wind fields forecast by the AROME NWP system using information from real alpine topographies. DEVINE proved able to reproduce the main features of wind fields in complex terrain (acceleration on ridges, leeward deceleration, deviations around obstacles). Furthermore, an evaluation against quality-checked observations acquired at 61 sites in the French Alps reveals improved behavior of the downscaled winds (the AROME wind speed mean bias is reduced by 27% with DEVINE), especially at the most elevated and exposed stations. Wind direction, however, is only slightly modified. Hence, despite some current limitations inherited from the ARPS simulation setup, DEVINE appears to be an efficient downscaling tool whose minimalist architecture, low input data requirements (NWP wind fields and high-resolution topography), and competitive computing times may be attractive for operational applications.
A Real-Time Spatio-Temporal Machine Learning Framework for the Prediction of Nearshore Wave Conditions
Pub Date : 2022-12-13  DOI: 10.1175/aies-d-22-0033.1
Jiaxin Chen, I. Ashton, E. Steele, A. Pillai

The safe and successful operation of offshore infrastructure relies on a detailed awareness of ocean wave conditions. Ongoing growth in offshore wind energy is focused on very large-scale projects deployed in ever more challenging environments. This inherently increases both cost and complexity, and therefore the requirement for efficient operational planning. To support this, we propose a new machine learning framework for the short-term forecasting of ocean wave conditions, to support critical decision-making associated with marine operations. Here, an attention-based Long Short-Term Memory (LSTM) neural network approach is used to learn short-term temporal patterns from in-situ observations. This is then integrated with an existing, low-computational-cost spatial nowcasting model to develop a complete framework for spatio-temporal forecasting. The framework addresses the challenge of filling gaps in the in-situ observations and undertakes feature selection, with seasonal training datasets embedded. The full spatio-temporal forecasting system is demonstrated using a case study based on independent observation locations near the southwest coast of the United Kingdom. Results are validated against in-situ data from two wave buoy locations within the domain and compared to operational physics-based wave forecasts from the Met Office (the UK's national weather service). For these two example locations, the spatio-temporal forecast achieves R² values of 0.9083 and 0.7409 for 1-hour-ahead significant wave height, and 0.8581 and 0.6978 for 12-hour-ahead forecasts, respectively. Importantly, this represents a respectable level of accuracy, comparable to traditional physics-based forecast products, but requires only a fraction of the computational resources.
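The R² scores quoted above are the standard coefficient of determination; for reference, a minimal implementation of the metric as it is typically applied to forecast-versus-observation series:

```python
import numpy as np

def r_squared(obs, pred):
    """Coefficient of determination: 1 minus the ratio of residual error
    to the variance of the observations about their mean."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

A perfect forecast scores 1.0, and a forecast no better than the observed mean scores 0.0, which puts the 0.9083 (1 h) and 0.6978 (12 h) figures above in context.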
Can a Machine-Learning-Enabled Numerical Model Help Extend Effective Forecast Range through Consistently Trained Subgrid-Scale Models?
Pub Date : 2022-11-30  DOI: 10.1175/aies-d-22-0050.1
Yongquan Qu, X. Shi

The development of machine learning (ML) techniques enables data-driven parameterizations, which have been investigated in many recent studies. Some investigations suggest that a priori trained ML models exhibit satisfactory accuracy during training but perform poorly when coupled to dynamical cores and tested. Here we use the evolution of the barotropic vorticity equation (BVE) with periodically reinforced shear instability as a prototype problem to develop and evaluate a model-consistent training strategy, which employs a numerical solver supporting automatic differentiation and includes the solver in the loss function for training ML-based subgrid-scale (SGS) turbulence models. This approach enables interaction between the dynamical core and the ML-based parameterization during the model training phase. The BVE model was run at low, high, and ultra-high (truth) resolutions. Our training dataset contains only a short period of coarsened high-resolution simulations. However, given initial conditions long after the training dataset time, the trained SGS model can still significantly increase the effective lead time of the BVE model running at the low resolution, by up to 50% compared to the BVE simulation without an SGS model. We also tested using a covariance matrix to normalize the loss function and found that it can notably boost the performance of the ML parameterization. The SGS model's performance is further improved by conducting transfer learning using a limited number of discontinuous observations, increasing the forecast lead time improvement to 73%. This study demonstrates a potential pathway to using machine learning to enhance the prediction skills of our climate and weather models.
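The study's setup places an automatically differentiable solver inside the loss function; the same idea, rolling the solver forward inside the loss so the learned parameter is trained against a coarsened "truth" trajectory, can be sketched with a scalar decay model and finite-difference gradients. Everything here is illustrative (a toy stand-in, not the BVE system or the authors' autodiff framework):

```python
import numpy as np

def rollout(k, u0, dt, n):
    """Explicit-Euler solver that is itself part of the loss, so the
    learned parameter k interacts with the time stepping during training."""
    u, traj = u0, []
    for _ in range(n):
        u = u + dt * (-k * u)   # toy dynamics: du/dt = -k*u
        traj.append(u)
    return np.array(traj)

def loss(k, u0, dt, truth):
    """Trajectory-matching loss through the solver (model-consistent)."""
    return np.mean((rollout(k, u0, dt, len(truth)) - truth) ** 2)

def train(u0=1.0, dt=0.1, n=20, k_true=0.8, lr=0.5, steps=200, eps=1e-5):
    """Fit k by gradient descent; central differences stand in for the
    automatic differentiation used in the paper."""
    truth = rollout(k_true, u0, dt, n)   # coarsened "high-res" trajectory
    k = 0.1
    for _ in range(steps):
        g = (loss(k + eps, u0, dt, truth) - loss(k - eps, u0, dt, truth)) / (2 * eps)
        k -= lr * g
    return k
```

Because the solver sits inside the loss, the parameter is optimized for its behavior when coupled to the dynamics, which is exactly the failure mode a priori training misses.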
A Deep Learning-based Velocity Dealiasing Algorithm Derived from the WSR-88D Open Radar Product Generator
Pub Date : 2022-11-23  DOI: 10.1175/aies-d-22-0084.1
M. Veillette, J. Kurdzo, P. Stepanian, Joseph McDonald, S. Samsi, John Y. N. Cho

Radial velocity estimates provided by Doppler weather radar are critical measurements used by operational forecasters for the detection and monitoring of life-impacting storms. The sampling methods used to produce these measurements are inherently susceptible to aliasing, which produces ambiguous velocity values in regions with high winds and must be corrected using a velocity dealiasing algorithm (VDA). In the US, the Weather Surveillance Radar - 1988 Doppler (WSR-88D) Open Radar Product Generator (ORPG) is a processing environment that provides a world-class VDA; however, this algorithm is complex and can be difficult to port to radar systems outside of the WSR-88D network. In this work, a Deep Neural Network (DNN) is used to emulate the two-dimensional WSR-88D ORPG dealiasing algorithm. It is shown that a DNN, specifically a customized U-Net, is highly effective for building VDAs that are accurate, fast, and portable to multiple radar types. To train the DNN model, a large dataset is generated containing aligned samples of folded and dealiased velocity pairs. This dataset contains samples collected from WSR-88D Level-II and Level-III archives and uses the ORPG dealiasing algorithm output as a source of truth. Using this dataset, a U-Net is trained to produce the number of folds at each point of a velocity image. Several performance metrics are presented using WSR-88D data. The algorithm is also applied to other, non-WSR-88D radar systems to demonstrate portability to other hardware/software interfaces. A discussion of the broad applicability of this method is presented, including how other Level-III algorithms may benefit from this approach.
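The fold count that the U-Net predicts at each pixel relates folded to true velocities through the Nyquist velocity. A minimal sketch of that relationship (the folding arithmetic only, not the ORPG algorithm itself):

```python
def fold(v, v_ny):
    """Alias a true radial velocity into the Nyquist interval [-v_ny, +v_ny)."""
    return ((v + v_ny) % (2.0 * v_ny)) - v_ny

def dealias(v_folded, n_folds, v_ny):
    """Recover the true velocity from the folded measurement and the fold
    count (the integer the network is trained to predict per pixel)."""
    return v_folded + 2.0 * n_folds * v_ny
```

For example, with a 25 m/s Nyquist velocity a true 62 m/s wind is observed as 12 m/s; adding back one full Nyquist interval (2 x 25 m/s) recovers it.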
Skillful US Soy-yield Forecasts at Pre-sowing Lead-times
Pub Date : 2022-11-23  DOI: 10.1175/aies-d-21-0009.1
S. Vijverberg, Raed Hamed, D. Coumou

Soy harvest failure events can severely impact farmers and insurance companies and raise global prices. Reliable seasonal forecasts of mis-harvests would allow stakeholders to prepare and take appropriate early action. However, especially for farmers, the reliability and lead time of current prediction systems provide insufficient information to justify within-season adaptation measures. Recent innovations have increased our ability to generate reliable statistical seasonal forecasts. Here, we combine these innovations to predict the 1-3 poor soy-harvest years in the eastern US. We first use a clustering algorithm to spatially aggregate crop-producing regions within the eastern US that are particularly sensitive to hot-dry weather conditions. Next, we use observational climate variables (sea surface temperature (SST) and soil moisture) to extract precursor timeseries at multiple lags. This allows the machine learning model to learn the low-frequency evolution, which carries important information for predictability. A selection based on causal inference allows for physically interpretable precursors. We show that the robustly selected predictors are associated with the evolution of the horseshoe Pacific SST pattern, in line with previous research. We use the state of the horseshoe Pacific to identify years with enhanced predictability. We achieve very high forecast skill for poor-harvest events, even 3 months prior to sowing, using strict one-step-ahead train-test splitting. Over the last 25 years, 82% of the poor harvests predicted in February were correct. When operational, this forecast would enable farmers (and insurance/trading companies) to make informed decisions on adaptation measures, e.g., selecting more drought-resistant cultivars, investing in insurance, or changing planting management.
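The exact splitting protocol is described in the paper; one common form of strict one-step-ahead splitting, in which each test year is predicted by a model trained only on earlier years so no future information leaks into training, can be sketched as follows (illustrative only):

```python
def one_step_ahead_splits(years):
    """Yield (train_years, test_year) pairs in which every training year
    strictly precedes the test year, mimicking operational forecasting."""
    years = sorted(years)
    for i in range(1, len(years)):
        yield years[:i], years[i]
```

Each split mimics the operational situation: when forecasting a given season, only seasons that have already happened are available for fitting.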
A Hybrid Physics-AI Model to Improve Hydrological Forecasts
Pub Date : 2022-11-01  DOI: 10.1175/aies-d-22-0023.1
Yanan Duan, S. Akula, Sanjiv Kumar, Wonjun Lee, Sepideh Khajehei

The National Oceanic and Atmospheric Administration has developed a very high-resolution streamflow forecast using the National Water Model (NWM) for 2.7 million stream locations in the United States. However, considerable challenges exist in quantifying forecast uncertainty and reliability at ungauged locations. A data science approach is presented to address this challenge. The long-range daily streamflow forecasts are analyzed from Dec. 2018 to Aug. 2021 for Alabama and Georgia. The forecast is evaluated at 389 observed USGS stream gauging locations using standard deterministic metrics. Next, the forecast errors are grouped using watersheds' biophysical characteristics, including drainage area, land use, soil type, and topographic index. The NWM forecasts are more skillful for larger and forested watersheds than for smaller and urban watersheds, and the NWM forecast considerably overestimates streamflow in the urban watersheds. A classification and regression tree analysis confirms the dependency of the forecast errors on the biophysical characteristics. A densely connected neural network model consisting of 6 layers (Deep Learning, DL) is developed using the biophysical characteristics and the NWM forecast as inputs and the forecast errors as outputs. The DL model successfully learns location-invariant, transferable knowledge from the gauged locations and applies the learned model to estimate forecast errors at ungauged locations. A temporal and spatial split of the gauged data shows that the probability of capturing the observations within the forecast range improves significantly, from 21 ± 1% in the NWM-only forecast to 82 ± 3% in the hybrid NWM-DL model. A tradeoff is noted between the overly constrained NWM forecast and the increased forecast uncertainty range of the DL model.
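The capture probability compared above (82 ± 3% vs. 21 ± 1%) is a coverage statistic over a forecast range. A minimal version, assuming per-time-step lower and upper forecast bounds (names and interface are illustrative):

```python
import numpy as np

def coverage(obs, lower, upper):
    """Fraction of observations falling inside the forecast range, i.e.
    the probability of capturing the observations reported in the study."""
    obs, lower, upper = map(np.asarray, (obs, lower, upper))
    return float(np.mean((obs >= lower) & (obs <= upper)))
```

A wider uncertainty range raises coverage at the cost of sharpness, which is the tradeoff the abstract notes between the overly constrained NWM forecast and the DL model's broader range.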
Pub Date : 2022-10-06DOI: 10.1175/aies-d-21-0004.1
C. Marzban, Jueyi Liu, P. Tissot
Resampling methods such as cross-validation or the bootstrap are often employed to estimate the uncertainty in a loss function due to sampling variability, usually for the purpose of model selection. But in models that require nonlinear optimization, the existence of local minima in the loss-function landscape introduces an additional source of variability that is confounded with sampling variability. In other words, some portion of the variability in the loss function across different resamples is due to local minima. Given that statistically sound model selection is based on an examination of variance, it is important to disentangle these two sources of variability. To that end, a methodology is developed for estimating each, specifically in the context of K-fold cross-validation and neural networks (NNs) whose training leads to different local minima. Random-effects models are used to estimate the two variance components: one due to sampling and one due to local minima. The results are examined as a function of the number of hidden nodes and of the variance of the initial weights, with the latter controlling the “depth” of the local minima. The main goal of the methodology is to increase statistical power in model selection and/or model comparison. Using both simulated and realistic data, it is shown that the two sources of variability can be comparable, casting doubt on model selection methods that ignore the variability due to local minima. Furthermore, the methodology is sufficiently flexible to allow assessment of the effect of any other NN parameters on variability.
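A minimal sketch of the variance decomposition described above, assuming a K × R table of loss values (K cross-validation folds by R random initializations) and a standard two-way random-effects, method-of-moments estimate. The function name and layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def variance_components(loss):
    """Split loss variability for a (K folds x R restarts) table into
    components due to sampling (rows), local minima (columns), and residual,
    via expected mean squares of a two-way random-effects model."""
    loss = np.asarray(loss, dtype=float)
    K, R = loss.shape
    grand = loss.mean()
    fold_means = loss.mean(axis=1)   # average over restarts
    seed_means = loss.mean(axis=0)   # average over folds
    ms_fold = R * np.sum((fold_means - grand) ** 2) / (K - 1)
    ms_seed = K * np.sum((seed_means - grand) ** 2) / (R - 1)
    resid = loss - fold_means[:, None] - seed_means[None, :] + grand
    ms_resid = np.sum(resid ** 2) / ((K - 1) * (R - 1))
    # Method-of-moments estimates; negative estimates are truncated at zero.
    var_sampling = max((ms_fold - ms_resid) / R, 0.0)
    var_minima = max((ms_seed - ms_resid) / K, 0.0)
    return var_sampling, var_minima, ms_resid
```

If the two returned components are comparable in size, model comparisons that attribute all variability to sampling will overstate their statistical confidence, which is the paper's central caution.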
Title: "On Variability due to Local Minima and K-fold Cross-validation"
Pub Date : 2022-10-06DOI: 10.1175/aies-d-22-0018.1
Ophélia Miralles, Daniel Steinfield, O. Martius, A. Davison
Near-surface wind is difficult to estimate using global numerical weather and climate models, as airflow is strongly modified by the underlying topography, especially in a country such as Switzerland. In this article, we use a statistical approach based on deep learning and a high-resolution digital elevation model to spatially downscale hourly near-surface wind fields from the ERA5 reanalysis, from their original 25-km resolution to a 1.1-km grid. A 1.1-km-resolution wind dataset for 2016–2020 from COSMO-1, the operational numerical weather prediction model of the national weather service MeteoSwiss, is used to train and validate our model: a generative adversarial network (GAN) with a gradient-penalized Wasserstein loss, aided by transfer learning. The results are realistic-looking high-resolution historical maps of gridded hourly wind fields over Switzerland and very good, robust predictions of the aggregated wind speed distribution. Regionally averaged image-specific metrics show a clear improvement in prediction over ERA5, with skill measures generally better for locations on the flatter Swiss Plateau than for Alpine regions. The downscaled wind fields demonstrate higher-resolution, physically plausible orographic effects, such as ridge acceleration and sheltering, which are not resolved in the original ERA5 fields.
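The gradient-penalized Wasserstein critic loss mentioned above can be sketched numerically. This assumes the critic outputs and the gradient norms at real/fake interpolates have already been computed (in practice the norms come from automatic differentiation in the training framework); the function name and the default penalty weight of 10 are illustrative assumptions, not details from the paper.

```python
import numpy as np

def wgan_gp_critic_loss(d_real, d_fake, grad_norms, lam=10.0):
    """Gradient-penalized Wasserstein critic loss: the critic maximizes
    mean(D(real)) - mean(D(fake)), so its loss is the negation, plus
    lam * mean((||grad D(x_hat)|| - 1)^2) at interpolated samples x_hat."""
    d_real, d_fake = np.asarray(d_real), np.asarray(d_fake)
    grad_norms = np.asarray(grad_norms)
    wasserstein = d_fake.mean() - d_real.mean()
    penalty = lam * np.mean((grad_norms - 1.0) ** 2)
    return float(wasserstein + penalty)
```

The penalty term drives the critic toward unit gradient norm, enforcing the 1-Lipschitz constraint that makes the Wasserstein estimate well behaved during GAN training.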
Title: "Downscaling of Historical Wind Fields over Switzerland using Generative Adversarial Networks"
Pub Date : 2022-10-06DOI: 10.1175/aies-d-22-0011.1
K. Pegion, E. Becker, B. Kirtman
We investigate the predictability of the sign of daily southeast U.S. (SEUS) precipitation anomalies associated with simultaneous predictors of large-scale climate variability using machine learning models. Models using index-based climate predictors and models using gridded fields of the large-scale circulation as predictors are utilized. Logistic regression (LR) and fully connected neural networks using indices of climate phenomena as predictors produce neither accurate nor reliable predictions, indicating that the indices themselves are not good predictors. Using gridded fields as predictors, an LR and a convolutional neural network (CNN) are more accurate than the index-based models. However, only the CNN produces reliable probabilistic predictions, which can be used to identify forecasts of opportunity. Using explainable machine learning, we identify which variables and grid points of the input fields are most relevant for confident and correct predictions by the CNN. Our results show that the local circulation is most important, as represented by the maximum relevance of 850-hPa geopotential heights and zonal winds for making skillful, high-probability predictions. Corresponding composite anomalies identify connections with El Niño–Southern Oscillation during winter and with the Atlantic Multidecadal Oscillation and the North Atlantic Subtropical High during summer.
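One way to sketch the "forecasts of opportunity" idea described above: act only on cases where the model's predicted sign probability is confident, and score accuracy on that subset. The confidence threshold and function name here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def opportunity_accuracy(prob_pos, truth_pos, confidence=0.8):
    """Score only 'forecasts of opportunity': predictions whose probability
    of the predicted sign exceeds a confidence threshold. Returns the
    accuracy on that confident subset and the number of cases kept."""
    prob_pos = np.asarray(prob_pos, dtype=float)
    truth_pos = np.asarray(truth_pos, dtype=bool)
    pred_pos = prob_pos >= 0.5
    conf = np.maximum(prob_pos, 1.0 - prob_pos)  # probability of predicted sign
    keep = conf >= confidence
    if not keep.any():
        return float("nan"), 0
    return float(np.mean(pred_pos[keep] == truth_pos[keep])), int(keep.sum())
```

This is only meaningful when the probabilities are reliable (well calibrated), which is why the abstract singles out the CNN rather than the index-based models for this use.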
Title: "Understanding Predictability of Daily Southeast US Precipitation using Explainable Machine Learning"