Pub Date : 2023-07-28 DOI: 10.1175/aies-d-23-0020.1
S. Kiefer, Sebastian Lerch, P. Ludwig, J. Pinto
Skillful weather prediction on subseasonal to seasonal timescales is crucial for many socio-economic ventures. However, forecasting on these timescales, especially of extremes, is very challenging because the information from initial conditions is gradually lost. Therefore, data-driven methods are discussed as an alternative to numerical weather prediction models. Here, Quantile Regression Forests (QRFs) and Random Forest Classifiers (RFCs) are used for probabilistic forecasting of Central European wintertime mean 2-meter temperatures and cold wave days at lead times of 14, 21 and 28 days. ERA5 reanalysis meteorological predictors are used as input data for the machine learning models. For the winters 2000/2001 to 2019/2020, the predictions are compared to a climatological ensemble obtained from E-OBS observational data. Continuous predictions of the full distribution are evaluated using the Continuous Ranked Probability Skill Score, and binary categorical forecasts using the Brier Skill Score. We find skill at lead times up to 28 days in the 20-winter mean and for individual winters. Case studies show that all machine learning models used are able to learn patterns in the data beyond climatology. A more detailed analysis using Shapley Additive Explanations suggests that both Random Forest (RF)-based models are able to learn physically known relationships in the data. This underlines that RF-based data-driven models can be a suitable tool for forecasting Central European wintertime 2-meter temperatures and the occurrence of cold wave days.
Title : Can Machine Learning Models be a Suitable Tool for Predicting Central European Cold Winter Weather on Subseasonal to Seasonal Timescales?
Journal : Artificial intelligence for the earth systems
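The Brier Skill Score mentioned in the abstract compares a probabilistic forecast with a reference forecast such as climatology. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function names and all numbers are hypothetical and chosen only for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, ref_probs, outcomes):
    """BSS = 1 - BS_forecast / BS_reference; positive values beat the reference."""
    return 1.0 - brier_score(probs, outcomes) / brier_score(ref_probs, outcomes)


# Hypothetical toy data: 1 = cold wave day observed, 0 = not observed.
outcomes = [1, 0, 0, 1, 0]
model_probs = [0.8, 0.1, 0.2, 0.7, 0.1]  # made-up model forecast probabilities
clim_probs = [0.4] * len(outcomes)       # made-up climatological base rate

bss = brier_skill_score(model_probs, clim_probs, outcomes)
print(f"Brier Skill Score vs. climatology: {bss:.3f}")
```

A score of 0 means the forecast is no better than the reference, and a score of 1 would be a perfect forecast.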
Pub Date : 2023-07-24 DOI: 10.1175/aies-d-22-0056.1
P. Ortiz, Eleanor Casas, M. Orescanin, S. Powell, V. Petković, Micky Hall
Visible and infrared radiance products of geostationary orbiting platforms provide virtually continuous observations of Earth. In contrast, low Earth orbiters observe passive microwave (PMW) radiances at any location much less frequently. Prior literature demonstrates the ability of a Machine Learning (ML) approach to build a link between these two complementary radiance spectra by predicting PMW observations using infrared and visible products collected from geostationary instruments, which could potentially deliver a highly desirable synthetic PMW product with nearly continuous spatio-temporal coverage. However, current ML models lack the ability to provide a measure of uncertainty of such a product, significantly limiting its applications. In this work, Bayesian Deep Learning is employed to generate synthetic Global Precipitation Measurement (GPM) mission Microwave Imager (GMI) data from Advanced Baseline Imager (ABI) observations with attached uncertainties over the ocean. The study first uses deterministic Residual Networks (ResNets) to generate synthetic GMI brightness temperatures with a mean absolute error as low as 1.72 K at the ABI spatio-temporal resolution. Then, for the same task, we use three Bayesian ResNet models to produce a comparable amount of error while providing previously unavailable predictive variance (i.e., uncertainty) for each synthetic data point. We find that the Flipout configuration provides the most robust calibration between uncertainty and error across GMI frequencies, and then demonstrate how this additional information is useful for discarding high-error synthetic data points prior to use by downstream applications.
Title : Uncertainty Calibration of Passive Microwave Brightness Temperatures Predicted by Bayesian Deep Learning Models
Journal : Artificial intelligence for the earth systems
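The idea of using predictive variance to discard likely high-error points can be illustrated with a simple threshold filter. This is only a sketch under assumptions: the threshold, the function names, and the brightness temperature values below are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(preds, truths):
    """Average absolute difference between predictions and reference values."""
    return sum(abs(p - t) for p, t in zip(preds, truths)) / len(preds)


def keep_confident(preds, truths, variances, max_variance):
    """Keep only points whose predictive variance is at or below max_variance."""
    kept = [(p, t) for p, t, v in zip(preds, truths, variances) if v <= max_variance]
    return [p for p, _ in kept], [t for _, t in kept]


# Hypothetical brightness temperatures in kelvin with made-up predictive variances.
truths = [250.0, 260.0, 270.0, 280.0]
preds = [251.0, 259.0, 276.0, 280.5]
variances = [0.5, 0.4, 9.0, 0.3]

mae_all = mean_absolute_error(preds, truths)
kept_preds, kept_truths = keep_confident(preds, truths, variances, max_variance=1.0)
mae_kept = mean_absolute_error(kept_preds, kept_truths)
print(f"MAE all points: {mae_all:.3f} K, MAE after filtering: {mae_kept:.3f} K")
```

The filter only helps when uncertainty is well calibrated against error, which is the property the abstract evaluates.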
Pub Date : 2023-07-17 DOI: 10.1175/aies-d-23-0003.1
L. Passarella, S. Mahajan
We construct a novel Multi-Input Multi-Output Autoencoder-decoder (MIMO-AE) to capture the non-linear relationship between Southern California precipitation (SC-PRECIP) and tropical Pacific Ocean sea surface temperature (TP-SST). The MIMO-AE is trained on both monthly TP-SST and SC-PRECIP anomalies simultaneously. The co-variability of the two fields in the MIMO-AE shared nonlinear latent space can be condensed into an index, termed the MIMO-AE index. We use a transfer learning approach to train a MIMO-AE on the combined dataset of 100 years of output from a historical simulation with the Energy Exascale Earth Systems Model version 1 and a segment of observational data. We further use Long Short-Term Memory networks to assess sub-seasonal predictability of SC-PRECIP using the MIMO-AE index. We find that the MIMO-AE index provides enhanced predictability of SC-PRECIP for lead times of up to four months compared with the Niño 3.4 index and the El Niño Southern Oscillation Longitudinal Index.
Title : Assessing Tropical Pacific-induced Predictability of Southern California Precipitation Using a Novel Multi-input Multi-output Autoencoder
Journal : Artificial intelligence for the earth systems
Pub Date : 2023-07-10 DOI: 10.1175/aies-d-23-0026.1
Lily-belle Sweet, Christoph Müller, Mohit Anand, J. Zscheischler
Machine learning algorithms are able to capture complex, nonlinear interacting relationships and are increasingly used to predict yield variability at regional and national scales. Using explainable artificial intelligence (XAI) methods applied to such algorithms may enable better scientific understanding of drivers of yield variability. However, XAI methods may provide misleading results when applied to spatiotemporally correlated datasets. In this study, machine learning models are trained to predict simulated crop yield from climate indices, and the impact of model evaluation strategy on the interpretation and performance of the resulting models is assessed. Using data from a process-based crop model allows us to then comment on the plausibility of the ‘explanations’ provided by XAI methods. Our results show that the choice of evaluation strategy has an impact on (i) interpretations of the model and (ii) model skill on held-out years and regions, after the evaluation strategy is used for hyperparameter tuning and feature selection. We find that use of a cross-validation strategy based on clustering in feature space achieves the most plausible interpretations as well as the best model performance on held-out years and regions. Our results provide first steps towards identifying domain-specific ‘best practices’ for the use of XAI tools on spatiotemporal agricultural or climatic data.
Title : Cross-validation strategy impacts the performance and interpretation of machine learning models
Journal : Artificial intelligence for the earth systems
Pub Date : 2023-07-06 DOI: 10.1175/aies-d-22-0079.1
C. Stan, Rama Sesha Sridhar Mantripragada
This paper presents a novel application of convolutional neural network (CNN) models for filtering the intraseasonal variability of the tropical atmosphere. In this deep learning filter, two convolutional layers are applied sequentially in a supervised machine learning framework to extract the intraseasonal signal from the total daily anomalies. The CNN-based filter can be tailored for each field similarly to fast Fourier transform filtering methods. When applied to two different fields (zonal wind stress and outgoing longwave radiation), the index of agreement between the filtered signal obtained using the CNN-based filter and a conventional weight-based filter is between 95% and 99%. The advantage of the CNN-based filter over conventional filters is its applicability to time series with a length comparable to the period of the signal being extracted.
Title : A deep learning filter for the intraseasonal variability of the tropics
Journal : Artificial intelligence for the earth systems
Pub Date : 2023-07-01 DOI: 10.1175/aies-d-22-0045.1
Daniel Galea, Julian Kunkel, B. Lawrence
Tropical cyclones are high-impact weather events that have large human and economic effects, so it is important to be able to understand how their location, frequency, and structure might change in a future climate. Here, a lightweight deep learning model is presented that is intended for detecting the presence or absence of tropical cyclones during the execution of numerical simulations for use in an online data reduction method. This will help to avoid saving vast amounts of data for analysis after the simulation is complete. With run-time detection, it might be possible to reduce the need for some of the high-frequency high-resolution output that would otherwise be required. The model was trained on ERA-Interim reanalysis data from 1979 to 2017, and the training was concentrated on delivering the highest possible recall rate (successful detection of cyclones) while rejecting enough data to make a difference in outputs. When tested using data from the two subsequent years, the recall, or probability of detection, was 92%. The precision, or success ratio, was 36%. For the desired data reduction application, if the desired target included all tropical cyclone events, even those that did not obtain hurricane-strength status, the effective precision was 85%. The recall rate and the area under the precision–recall curve (AUC-PR) compare favorably with other methods of cyclone identification while using the smallest number of parameters for both training and inference.
Title : TCDetect: A New Method of Detecting the Presence of Tropical Cyclones Using Deep Learning
Journal : Artificial intelligence for the earth systems
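Recall (probability of detection) and precision (success ratio) both follow from simple detection counts. The sketch below shows the arithmetic; the counts are hypothetical, chosen only so that the resulting rates match the 92% and 36% quoted in the abstract, and they are not the paper's actual confusion matrix.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from detection counts."""
    precision = true_pos / (true_pos + false_pos)  # share of detections that are real
    recall = true_pos / (true_pos + false_neg)     # share of real events detected
    return precision, recall


# Hypothetical counts picked to reproduce the quoted rates, not taken from the paper.
precision, recall = precision_recall(true_pos=207, false_pos=368, false_neg=18)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

For a data reduction use case, a high recall matters most: a missed cyclone means its data is lost, while a false alarm only costs some extra storage.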
Tributary phosphorus (P) loads are one of the main drivers of eutrophication problems in freshwater lakes. Being able to predict P loads can aid in understanding subsequent load patterns and elucidate potential degraded water quality conditions in downstream surface waters. We demonstrate the development and performance of an integrated multimedia modeling system that uses machine learning (ML) to assess and predict monthly total P (TP) and dissolved reactive P (DRP) loads. Meteorological variables from the Weather Research and Forecasting (WRF) Model, hydrologic variables from the Variable Infiltration Capacity model, and agricultural management practice variables from the Environmental Policy Integrated Climate agroecosystem model are utilized to train the ML models to predict P loads. Our study presents a new modeling methodology using as testbeds the Maumee, Sandusky, Portage, and Raisin watersheds, which discharge into Lake Erie and contribute significant P loads to the lake. Two models were built, one for TP loads using 10 environmental variables and one for DRP loads using nine environmental variables. Both models ranked streamflow as the most important predictive variable. In comparison with observations, TP and DRP loads were predicted very well temporally and spatially. Modeling results of TP loads are within the ranges of those obtained from other studies and on some occasions more accurate. Modeling results of DRP loads exceed performance measures from other studies. We explore the ability of both ML-based models to further improve as more data become available over time. This integrated multimedia approach is recommended for studying other freshwater systems and water quality variables using available decadal data from physics-based model simulations.
Title : A New Approach to Predict Tributary Phosphorus Loads Using Machine Learning- and Physics-Based Modeling Systems.
Authors : Christina Feng Chang, Marina Astitha, Yongping Yuan, Chunling Tang, Penny Vlahos, Valerie Garcia, Ummul Khaira
Pub Date : 2023-07-01 DOI: 10.1175/aies-d-22-0049.1
Journal : Artificial intelligence for the earth systems, vol. 2, no. 3, pp. 1-20
Open access PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10569129/pdf/
Pub Date : 2023-06-30 DOI: 10.1175/aies-d-22-0009.1
O. Chkrebtii, F. Bingham
We explore the use of ocean near-surface salinity (NSS), i.e., salinity at 1 m depth, as a rainfall occurrence detector for hourly precipitation using data from the SPURS-2 (Salinity Processes in the Upper-ocean Regional Studies - 2) mooring at 10°N, 125°W. Our proposed unsupervised learning algorithm consists of two stages. First, an empirical quantile-based identification of dips in NSS enables us to capture most events with an hourly averaged rainfall rate > 5 mm/hr. Over-estimation of precipitation duration is then corrected locally by fitting a parametric model based on the salinity balance equation. We propose a local precipitation model composed of a small number of calibration parameters representing individual rainfall events and their location in time. We show that unsupervised rainfall detection can be formulated as a statistical problem of predicting these variables from NSS data. We present our results and provide a validation technique based on data collected at the SPURS-2 mooring.
Title : Automatic detection of rainfall at hourly time scales from mooring near-surface salinity in the eastern tropical Pacific
Journal : Artificial intelligence for the earth systems
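The first stage described in the abstract, a quantile-based identification of dips in NSS, might look roughly like the sketch below. This is one plausible reading and not the authors' algorithm: thresholding hour-to-hour salinity changes at a low empirical quantile is an assumption, and the function names, the 20% level, and the salinity series are all hypothetical.

```python
def empirical_quantile(values, percent):
    """Nearest-rank empirical quantile for an integer percent level (1 to 100)."""
    ordered = sorted(values)
    rank = max(1, -(-percent * len(ordered) // 100))  # integer ceiling of percent*n/100
    return ordered[rank - 1]


def detect_dips(nss, percent=20):
    """Return indices of hours whose drop from the previous hour is at or below
    the chosen empirical quantile of all hour-to-hour salinity changes."""
    changes = [b - a for a, b in zip(nss, nss[1:])]
    threshold = empirical_quantile(changes, percent)
    return [i + 1 for i, c in enumerate(changes) if c < 0 and c <= threshold]


# Hypothetical hourly near-surface salinity series with two sharp freshening events.
nss = [34.0, 34.0, 33.9, 33.2, 33.5, 33.8, 33.9, 34.0, 34.0, 33.1, 33.6]
dip_hours = detect_dips(nss, percent=20)
print("Hours flagged as possible rainfall:", dip_hours)
```

The second stage in the abstract, which corrects over-estimated rain duration by fitting a model based on the salinity balance equation, is not covered by this sketch.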
Pub Date : 2023-06-23 DOI: 10.1175/aies-d-22-0042.1
B. Scarino, K. Itterly, Kristopher Bedka, C. Homeyer, J. Allen, S. Bang, Daniel J. Cecil
Geostationary satellite imagers provide historical and near-real-time observations of cloud top patterns that are commonly associated with severe convection. Environmental conditions favorable for severe weather are thought to be represented well by reanalyses. Predicting exactly where convection and costly storm hazards like hail will occur using models or satellite imagery alone, however, is extremely challenging. The multivariate combination of satellite-observed cloud patterns with reanalysis environmental parameters, linked to Next Generation Weather Radar- (NEXRAD-) estimated Maximum Expected Size of Hail (MESH) using a deep neural network (DNN), enables estimation of potentially severe hail likelihood for any observed storm cell. These estimates are made where satellites observe cold clouds, indicative of convection, located in favorable storm environments. We seek an approach that can be used to estimate climatological hailstorm frequency and risk throughout the historical satellite data record. Statistical distributions of convective parameters from satellite and reanalysis show separation between non-severe/severe hailstorm classes for predictors including overshooting cloud top temperature and area characteristics, vertical wind shear, and convective inhibition. These complex, multivariate predictor relationships are exploited within a DNN to produce a likelihood estimate with a critical success index of 0.511 and Heidke skill score of 0.407, which is exceptional among analogous hail studies. Furthermore, applications of the DNN to case studies demonstrate good qualitative agreement between hail likelihood and MESH. These hail classifications are aggregated across an 11-year GOES-12/13 image database to derive a hail frequency and severity climatology, which denotes the Central Plains, the Midwest, and northwestern Mexico as being the most hail-prone regions within the domain studied.
Title: Deriving Severe Hail Likelihood from Satellite Observations and Model Reanalysis Parameters using a Deep Neural Network. Authors: B. Scarino, K. Itterly, Kristopher Bedka, C. Homeyer, J. Allen, S. Bang, Daniel J. Cecil. Journal: Artificial Intelligence for the Earth Systems. Published: 2023-06-23. DOI: 10.1175/aies-d-22-0042.1 (https://doi.org/10.1175/aies-d-22-0042.1)
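The hail abstract above reports a critical success index (CSI) and a Heidke skill score (HSS) for a binary severe/non-severe classification. As a minimal sketch of how these two scores are conventionally computed from a 2x2 contingency table, the snippet below uses the standard textbook definitions. The counts are made up for illustration and are not the study's data; the function and variable names are likewise assumptions.

```python
def critical_success_index(hits, false_alarms, misses):
    """CSI = hits / (hits + misses + false alarms); correct negatives are ignored."""
    denom = hits + misses + false_alarms
    if denom == 0:
        raise ValueError("CSI is undefined when there are no events or forecasts")
    return hits / denom


def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """HSS for a 2x2 table: accuracy relative to that expected by random chance."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    if denom == 0:
        raise ValueError("HSS is undefined for a degenerate contingency table")
    return 2.0 * (a * d - b * c) / denom


# Hypothetical contingency-table counts (NOT taken from the paper).
hits, false_alarms, misses, correct_negatives = 50, 20, 30, 100

csi = critical_success_index(hits, false_alarms, misses)
hss = heidke_skill_score(hits, false_alarms, misses, correct_negatives)
print(f"CSI = {csi:.3f}, HSS = {hss:.3f}")
```

CSI ignores correct negatives, which makes it a common choice for rare events such as severe hail, while HSS measures improvement over random chance and therefore uses all four cells of the table.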
Pub Date: 2023-06-23 DOI: 10.1175/aies-d-22-0080.1
Brian C. Filipiak, N. Bassill, Kristen Corbosiero, A. Lang, Ross A. Lazear
Winter mixed precipitation events are associated with multiple hazards and create forecast challenges because the timing and amount of each precipitation type are difficult to determine. In New York State, complex terrain compounds these challenges. Machine learning is a relatively new tool that can help improve forecasting by synthesizing large amounts of data and finding underlying relationships. This study uses a random forest machine learning algorithm to generate probabilistic winter precipitation type forecasts. Random forest configuration, testing, and development methods are presented to show how this tool can be applied to operational forecasting. Dataset generation and variation are also explained because they are essential to the random forest. Lastly, the methodology for transitioning a machine learning algorithm from research to operations is discussed.
Title: Probabilistic Forecasting Methods of Winter Mixed Precipitation Events in New York State Utilizing a Random Forest. Journal: Artificial Intelligence for the Earth Systems. https://doi.org/10.1175/aies-d-22-0080.1
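The abstract above describes a random forest that produces probabilistic precipitation-type forecasts but does not say how the probabilities are formed. One simple way an ensemble of trees can yield class probabilities is to take the fraction of trees voting for each class; the sketch below illustrates only that idea. The class labels, the vote list, and the function name are hypothetical, and the study's actual configuration may differ (scikit-learn's `RandomForestClassifier.predict_proba`, for instance, averages the per-tree class probabilities rather than counting hard votes).

```python
from collections import Counter

# Hypothetical precipitation-type labels; the study's actual classes may differ.
PTYPES = ("rain", "snow", "freezing rain", "sleet")


def vote_probabilities(tree_votes, classes=PTYPES):
    """Convert one class vote per tree into a probability for every class."""
    if not tree_votes:
        raise ValueError("need at least one tree vote")
    counts = Counter(tree_votes)
    unknown = set(counts) - set(classes)
    if unknown:
        raise ValueError(f"unexpected classes: {sorted(unknown)}")
    n_trees = len(tree_votes)
    # Counter returns 0 for classes that received no votes.
    return {c: counts[c] / n_trees for c in classes}


# Made-up votes from a 10-tree forest for a single location and time.
votes = ["snow"] * 6 + ["sleet"] * 3 + ["rain"]
probs = vote_probabilities(votes)
print(probs)
```

Reporting the full set of class probabilities, rather than only the majority class, is what lets a forecaster see that a mixed-precipitation outcome is plausible even when one type is favored.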