Estimating historical evapotranspiration (ET) is essential for understanding the effects of climate change and human activities on the water cycle. This study used historical weather station data and machine learning to reconstruct ET trends over the past 300 years. A Random Forest model, trained on monthly data from FLUXNET2015 flux stations with precipitation, temperature, aridity index, and rooting depth as predictors, achieved an R² of 0.66 and a Kling-Gupta efficiency (KGE) of 0.76 in 10-fold cross-validation. Applied to 5267 weather stations, the model produced monthly ET data showing a general increase in global ET from 1700 to the present, with a notable acceleration after 1900 due to warming. Regional differences were observed, with larger ET increases in the mid-to-high latitudes of the Northern Hemisphere and decreases in some mid-to-low latitudes and the Southern Hemisphere. In drylands, ET and temperature were weakly correlated, while in humid areas the correlation was much higher. The correlation between ET and precipitation has remained stable over the centuries. This study extends the time span of ET data, providing insights into long-term historical ET trends and their drivers, which helps in reassessing the impact of historical climate change and human activities on the water cycle and supports future climate adaptation strategies.
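The abstract reports skill as a KGE score. As a rough illustration of what that number measures, the sketch below computes the Kling-Gupta efficiency in its original 2009 form, which combines correlation, a variability ratio, and a bias ratio. Which KGE variant the study used is not stated in the abstract, so this choice is an assumption, and the function name `kge` and the sample values are illustrative only.

```python
import math


def kge(observed, simulated):
    """Kling-Gupta efficiency (2009 form, assumed here): 1 is a perfect score."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    # Population standard deviations of both series.
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated) / n)
    # Pearson correlation between observed and simulated values.
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated)) / n
    r = cov / (sd_o * sd_s)
    alpha = sd_s / sd_o      # variability ratio
    beta = mean_s / mean_o   # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Hypothetical monthly ET values (mm/month), for illustration only.
obs = [20.0, 35.0, 60.0, 90.0, 75.0, 40.0]
sim = [25.0, 30.0, 55.0, 80.0, 70.0, 45.0]
print(round(kge(obs, sim), 3))
```

In a 10-fold cross-validation, a score like this would be computed on the held-out fold of each split and then summarised across folds.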
Title: Evapotranspiration trends over the last 300 years reconstructed from historical weather station observations via machine learning. Authors: Haiyang Shi. DOI: arxiv-2407.16265 (https://doi.org/arxiv-2407.16265). Published in arXiv - PHYS - Atmospheric and Oceanic Physics, 2024-07-23.
Anthony McNally, Christian Lessig, Peter Lean, Eulalie Boucher, Mihai Alexe, Ewan Pinnington, Matthew Chantry, Simon Lang, Chris Burrows, Marcin Chrust, Florian Pinault, Ethel Villeneuve, Niels Bormann, Sean Healy
Skilful machine-learned weather forecasts have challenged our approach to numerical weather prediction, demonstrating competitive performance compared to traditional physics-based approaches. Data-driven systems have been trained to forecast future weather by learning from long historical records of past weather such as the ECMWF ERA5 reanalysis. These datasets have been made freely available to the wider research community, including the commercial sector, which has been a major factor in the rapid rise of ML forecast systems and the levels of accuracy they have achieved. However, historical reanalyses used for training and real-time analyses used for initial conditions are produced by data assimilation, an optimal blending of observations with a physics-based forecast model. As such, many ML forecast systems have an implicit and unquantified dependence on the physics-based models they seek to challenge. Here we propose a new approach: training a neural network to predict future weather purely from historical observations, with no dependence on reanalyses. We use raw observations to initialise a model of the atmosphere (in observation space) learned directly from the observations themselves. Forecasts of crucial weather parameters (such as surface temperature and wind) are obtained by predicting weather parameter observations (e.g. SYNOP surface data) at future times and arbitrary locations. We present preliminary results on forecasting observations 12 hours into the future. These already demonstrate successful learning of the time evolution of the physical processes captured in real observations. We argue that this new approach, by staying purely in observation space, avoids many of the challenges of traditional data assimilation, can exploit a wider range of observations, and is readily expanded to simultaneous forecasting of the full Earth system (atmosphere, land, ocean and composition).
Title: Data driven weather forecasts trained and initialised directly from observations. DOI: arxiv-2407.15586 (https://doi.org/arxiv-2407.15586). Published in arXiv - PHYS - Atmospheric and Oceanic Physics, 2024-07-22.
Henry Addison, Elizabeth Kendon, Suman Ravuri, Laurence Aitchison, Peter AG Watson
High-resolution climate simulations are very valuable for understanding climate change impacts and planning adaptation measures. This has motivated the use of regional climate models at sufficiently fine resolution to capture important small-scale atmospheric processes, such as convective storms. However, these regional models have very high computational costs, limiting their applicability. We present CPMGEM, a novel application of a generative machine learning model, a diffusion model, to skilfully emulate precipitation simulations from such a high-resolution model over England and Wales at much lower cost. This emulator enables stochastic generation of high-resolution (8.8 km), daily-mean precipitation samples conditioned on coarse-resolution (60 km) weather states from a global climate model. The output is fine enough for use in applications such as flood inundation modelling. The emulator produces precipitation predictions with realistic intensities and spatial structures and captures most of the 21st-century climate change signal. We show evidence that the emulator has skill for extreme events up to and including 1-in-100-year intensities. Potential applications include producing high-resolution precipitation predictions for large-ensemble climate simulations and downscaling different climate models and climate change scenarios to better sample uncertainty in climate changes at the local scale.
Title: Machine learning emulation of precipitation from km-scale regional climate simulations using a diffusion model. DOI: arxiv-2407.14158 (https://doi.org/arxiv-2407.14158). Published in arXiv - PHYS - Atmospheric and Oceanic Physics, 2024-07-19.
Kuniaki Inoue, Maxwell Kelley, Ann M. Fridlind, Michela Biasutti, Gregory S. Elsaesser
This paper addresses the challenges in computing the column moist static energy (MSE) budget in climate models. Residuals from such computations often match other major budget terms in magnitude, obscuring their contributions. This study introduces a methodology for accurately computing the column MSE budget in climate models, demonstrated using the GISS ModelE3. Multiple factors leading to significant residuals are identified, with the failure of the continuous calculus's chain rule upon discretization being the most critical. This failure causes the potential temperature equation to diverge from the enthalpy equation in discretized models. Consequently, in models using potential temperature as a prognostic variable, the MSE budget equation is fundamentally not upheld, requiring a tailored strategy to close the budget. This study introduces the "process increment method" for accurately computing the column MSE flux divergence. This method calculates the difference in the sum of column internal energy, geopotential, and latent heats before and after applying the dynamics scheme. Furthermore, the calculated column flux divergence is decomposed into its advective components. These computations enable precise MSE budget analysis. The most crucial finding is that vertical interpolation into pressure coordinates can introduce errors substantial enough to reverse the sign of vertical MSE advection in the warm pool regions. In ModelE3, accurately computed values show MSE import via vertical circulations, while values in pressure coordinates indicate export. This discrepancy may prompt a reevaluation of vertical advection as an exporting mechanism and underscores the importance of precise MSE budget calculations.
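The idea behind the process increment method, as the abstract describes it, can be sketched as a before/after difference of a column-summed energy. The snippet below is only a toy illustration under stated assumptions: the constants `CV`, `G`, and `LV`, the layer-mass weighting, the restriction to water vapour for the latent term, and the sign convention (divergence taken as minus the tendency produced by the dynamics step) are all choices made here, not details taken from ModelE3 or the paper.

```python
# Approximate physical constants (assumed values for illustration).
CV = 717.0    # specific heat of dry air at constant volume, J kg^-1 K^-1
G = 9.81      # gravitational acceleration, m s^-2
LV = 2.5e6    # latent heat of vaporisation, J kg^-1


def column_energy(temperature, height, humidity, layer_mass):
    """Mass-weighted column sum of internal, geopotential, and latent energy (J m^-2)."""
    return sum(
        (CV * t + G * z + LV * q) * m
        for t, z, q, m in zip(temperature, height, humidity, layer_mass)
    )


def process_increment_divergence(state_before, state_after, dt):
    """Column flux divergence (W m^-2) diagnosed from one dynamics step.

    Each state is a tuple (temperature, height, humidity, layer_mass) of
    per-layer lists. The sign convention is an assumption of this sketch.
    """
    increment = column_energy(*state_after) - column_energy(*state_before)
    return -increment / dt


# Hypothetical two-layer column before and after a dynamics step of 1800 s.
before = ([290.0, 270.0], [500.0, 3000.0], [0.010, 0.004], [4000.0, 4000.0])
after = ([290.2, 270.1], [500.0, 3000.0], [0.0101, 0.004], [4000.0, 4000.0])
print(process_increment_divergence(before, after, dt=1800.0))
```

Because the diagnostic uses the model's own prognostic state on its native levels, it avoids the interpolation to pressure coordinates that the abstract identifies as a source of sign-reversing errors.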
Title: Accurate Column Moist Static Energy Budget in Climate Models. Part 1: Conservation Equation Formulation, Methodology, and Primary Results Demonstrated Using GISS ModelE3. DOI: arxiv-2407.13855 (https://doi.org/arxiv-2407.13855). Published in arXiv - PHYS - Atmospheric and Oceanic Physics, 2024-07-18.
We determine distributions and correlation properties of offshore wind speeds and wind speed increments by analyzing wind data sampled with a resolution of one second for 20 months at different heights above sea level in the North Sea. Distributions of horizontal wind speeds can be fitted to Weibull distributions with shape and scale parameters varying weakly with the vertical height separation. Kullback-Leibler divergences between distributions at different heights change with the squared logarithm of the height ratio. Cross-correlations between time derivatives of wind speeds are long-term anticorrelated and their correlation functions satisfy sum rules. Distributions of horizontal wind speed increments change from a tent-like shape to a Gaussian with rising increment lag. A surprising peak occurs in the left tail of the increment distributions for lags in the range $10-200\,{\rm km}$ after applying Taylor's hypothesis locally to transform time lags into distances. The peak is decisive for obtaining the expected and observed linear scaling of third-order structure functions with distance. This suggests that it is an intrinsic feature of atmospheric turbulence.
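The quantities named in this abstract (wind speed increments, Taylor's hypothesis, third-order structure functions) can be illustrated with a small sketch. The function names and sample series below are hypothetical, and the conversion of time lags to distances uses the mean speed of the whole series, a simplification of the local application described in the abstract.

```python
def increments(speeds, lag):
    """Wind speed increments u(t + lag) - u(t) for an integer sample lag."""
    return [speeds[i + lag] - speeds[i] for i in range(len(speeds) - lag)]


def structure_function(speeds, lag, order=3):
    """Average of the increments raised to the given power."""
    deltas = increments(speeds, lag)
    return sum(d ** order for d in deltas) / len(deltas)


def taylor_distance(speeds, lag, dt=1.0):
    """Distance for a time lag via Taylor's hypothesis: mean speed times elapsed time.

    Uses the mean over the whole series (a simplifying assumption of this sketch).
    """
    mean_speed = sum(speeds) / len(speeds)
    return mean_speed * lag * dt


# Hypothetical 1 Hz wind speed samples in m/s, for illustration only.
u = [8.0, 8.4, 7.9, 8.8, 9.1, 8.6, 8.2, 8.9]
for lag in (1, 2, 4):
    print(lag, taylor_distance(u, lag), structure_function(u, lag))
```

Plotting the third-order structure function against the Taylor distance on real data is how a linear scaling with distance, as mentioned in the abstract, would be checked.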
Title: Distributions and correlation properties of offshore wind speeds and wind speed increments. Authors: So-Kumneth Sim, Philipp Maass, H. Eduardo Roman. DOI: arxiv-2407.12934 (https://doi.org/arxiv-2407.12934). Published in arXiv - PHYS - Atmospheric and Oceanic Physics, 2024-07-17.
Piyu Ke, Philippe Ciais, Stephen Sitch, Wei Li, Ana Bastos, Zhu Liu, Yidi Xu, Xiaofan Gui, Jiang Bian, Daniel S Goll, Yi Xi, Wanjing Li, Michael O'Sullivan, Jeffeson Goncalves de Souza, Pierre Friedlingstein, Frederic Chevallier
In 2023, the CO2 growth rate was 3.37 ± 0.11 ppm at Mauna Loa, 86% above the previous year and the highest since observations began in 1958, while global fossil fuel CO2 emissions increased by only 0.6 ± 0.5%. This implies an unprecedented weakening of land and ocean sinks, and raises the question of where and why this reduction happened. Here we show a global net land CO2 sink of 0.44 ± 0.21 GtC yr⁻¹, the weakest since 2003. We used dynamic global vegetation models, satellite fire emissions, an atmospheric inversion based on OCO-2 measurements, and emulators of ocean biogeochemical and data-driven models to deliver a fast-track carbon budget for 2023. Those models ensured consistency with previous carbon budgets. Regional flux anomalies from 2015-2022 are consistent between top-down and bottom-up approaches, with the largest abnormal carbon loss in the Amazon during the drought in the second half of 2023 (0.31 ± 0.19 GtC yr⁻¹), extreme fire emissions of 0.58 ± 0.10 GtC yr⁻¹ in Canada, and a loss in South-East Asia (0.13 ± 0.12 GtC yr⁻¹). Since 2015, land CO2 uptake north of 20°N declined by half to 1.13 ± 0.24 GtC yr⁻¹ in 2023. Meanwhile, the tropics recovered from the 2015-16 El Niño carbon loss, gained carbon during the La Niña years (2020-2023), then switched to a carbon loss during the 2023 El Niño (0.56 ± 0.23 GtC yr⁻¹). The ocean sink was stronger than normal in the equatorial eastern Pacific due to reduced upwelling from La Niña's retreat in early 2023 and the later development of El Niño. Land regions exposed to extreme heat in 2023 contributed a gross carbon loss of 1.73 GtC yr⁻¹, indicating that record warming in 2023 had a strong negative impact on the capacity of terrestrial ecosystems to mitigate climate change.
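To relate the growth rate in ppm to the sink fluxes in GtC, a commonly used conversion is about 2.124 GtC per ppm of atmospheric CO2. The sketch below applies that factor to the Mauna Loa numbers quoted in the abstract; the factor and the function name `ppm_to_gtc` are assumptions of this illustration, not values or methods taken from the paper.

```python
# Widely used approximate conversion factor (assumed here): GtC per ppm of CO2.
GTC_PER_PPM = 2.124


def ppm_to_gtc(ppm):
    """Convert a change in atmospheric CO2 (ppm) to carbon mass (GtC)."""
    return ppm * GTC_PER_PPM


# Growth rate and uncertainty quoted in the abstract for 2023 (ppm per year).
growth_ppm = 3.37
growth_ppm_uncertainty = 0.11

print(round(ppm_to_gtc(growth_ppm), 2), "GtC per year")
print(round(ppm_to_gtc(growth_ppm_uncertainty), 2), "GtC per year uncertainty")
```

A budget closure would then compare this atmospheric increase with emissions and the estimated land and ocean sinks; those remaining terms are not fully given in the abstract, so they are left out here.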
Title: Low latency carbon budget analysis reveals a large decline of the land carbon sink in 2023. DOI: arxiv-2407.12447 (https://doi.org/arxiv-2407.12447). Published in arXiv - PHYS - Atmospheric and Oceanic Physics, 2024-07-17.
Puja Das, August Posch, Nathan Barber, Michael Hicks, Thomas J. Vandal, Kate Duffy, Debjani Singh, Katie van Werkhoven, Auroop R. Ganguly
Precipitation nowcasting, critical for flood emergency and river management, has remained challenging for decades, although recent developments in deep generative modeling (DGM) suggest the possibility of improvements. River management centers, such as the Tennessee Valley Authority, have been using Numerical Weather Prediction (NWP) models for nowcasting but have struggled with missed detections even from best-in-class NWP models. While decades of prior research achieved limited improvements beyond advection and localized evolution, recent attempts have shown progress from physics-free machine learning (ML) methods and even greater improvements from physics-embedded ML approaches. Developers of DGM for nowcasting have compared their approaches with optical flow (a variant of advection) and meteorologists' judgment but not with NWP models. Further, they have not conducted independent co-evaluations with water resources and river managers. Here, we show that the state-of-the-art physics-embedded deep generative model, specifically NowcastNet, outperforms the High-Resolution Rapid Refresh (HRRR) model, the latest generation of NWP, along with advection and persistence, especially for heavy precipitation events. For grid-cell extremes over 16 mm/h, NowcastNet demonstrated a median critical success index (CSI) of 0.30, compared with a median CSI of 0.04 for HRRR. However, despite hydrologically relevant improvements in point-by-point forecasts from NowcastNet, caveats include the overestimation of spatially aggregated precipitation over longer lead times. Our co-evaluation with ML developers, hydrologists, and river managers suggests the possibility of improved flood emergency response and hydropower management.
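The critical success index (CSI) quoted above is a standard categorical score: hits divided by the sum of hits, misses, and false alarms for events exceeding a threshold. A minimal sketch follows, using the 16 mm/h threshold from the abstract; the function name and the sample rain rates are hypothetical, and the paper's median over cases or grid cells is not reproduced.

```python
def critical_success_index(forecast, observed, threshold):
    """CSI = hits / (hits + misses + false alarms) for values above the threshold."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        forecast_event = f > threshold
        observed_event = o > threshold
        if forecast_event and observed_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif forecast_event:
            false_alarms += 1
    total = hits + misses + false_alarms
    # With no forecast or observed events the score is undefined.
    return hits / total if total else float("nan")


# Hypothetical grid-cell rain rates in mm/h, for illustration only.
forecast = [20.0, 5.0, 18.0, 0.0, 30.0]
observed = [17.0, 19.0, 2.0, 0.0, 25.0]
print(critical_success_index(forecast, observed, threshold=16.0))  # 0.5
```

Correct negatives do not enter the score, which is why CSI is often preferred for rare events such as heavy precipitation.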
Title: Hybrid physics-AI outperforms numerical weather prediction for extreme precipitation nowcasting. DOI: arxiv-2407.11317 (https://doi.org/arxiv-2407.11317). Published in arXiv - PHYS - Atmospheric and Oceanic Physics, 2024-07-16.
Piotr Mirowski, David Warde-Farley, Mihaela Rosca, Matthew Koichi Grimes, Yana Hasson, Hyunjik Kim, Mélanie Rey, Simon Osindero, Suman Ravuri, Shakir Mohamed
Atmospheric states derived from reanalysis comprise a substantial portion of weather and climate simulation outputs. Many stakeholders -- such as researchers, policy makers, and insurers -- use this data to better understand the earth system and guide policy decisions. Atmospheric states have also received increased interest as machine learning approaches to weather prediction have shown promising results. A key issue for all audiences is that dense time series of these high-dimensional states comprise an enormous amount of data, precluding all but the most well-resourced groups from accessing and using historical data and future projections. To address this problem, we propose a method for compressing atmospheric states using methods from the neural network literature, adapting spherical data to processing by conventional neural architectures through the use of the area-preserving HEALPix projection. We investigate two model classes for building neural compressors: the hyperprior model from the neural image compression literature and recent vector-quantised models. We show that both families of models satisfy the desiderata of small average error, a small number of high-error reconstructed pixels, faithful reproduction of extreme events such as hurricanes and heatwaves, and preservation of the spectral power distribution across spatial scales. We demonstrate compression ratios in excess of 1000x, with compression and decompression at a rate of approximately one second per global atmospheric state.
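For a sense of scale, the sketch below works through the arithmetic behind a HEALPix grid and a 1000x compression ratio. HEALPix divides the sphere into 12 × nside² equal-area pixels; the resolution (`nside = 256`), the channel count, and the 4-byte float storage are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def healpix_npix(nside):
    """Number of equal-area pixels in a HEALPix grid of resolution nside."""
    return 12 * nside * nside


def healpix_pixel_area_sr(nside):
    """Solid angle of each pixel in steradians (every pixel has the same area)."""
    return 4.0 * math.pi / healpix_npix(nside)


def compressed_size_bytes(raw_bytes, ratio):
    """Size after compression at the given compression ratio."""
    return raw_bytes / ratio


# Hypothetical setup: nside, channel count, and float32 storage are assumptions.
nside = 256
channels = 100
raw = healpix_npix(nside) * channels * 4  # bytes for one atmospheric state

print(healpix_npix(nside))                # 786432 pixels
print(raw / 1e6, "MB raw")
print(compressed_size_bytes(raw, 1000.0) / 1e3, "kB at 1000x")
```

The equal pixel area is what makes the projection convenient for conventional image-style networks, since every pixel carries the same weight in area-averaged error metrics.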
{"title":"Neural Compression of Atmospheric States","authors":"Piotr Mirowski, David Warde-Farley, Mihaela Rosca, Matthew Koichi Grimes, Yana Hasson, Hyunjik Kim, Mélanie Rey, Simon Osindero, Suman Ravuri, Shakir Mohamed","doi":"arxiv-2407.11666","DOIUrl":"https://doi.org/arxiv-2407.11666","url":null,"abstract":"Atmospheric states derived from reanalysis comprise a substantial portion of weather and climate simulation outputs. Many stakeholders -- such as researchers, policy makers, and insurers -- use this data to better understand the earth system and guide policy decisions. Atmospheric states have also received increased interest as machine learning approaches to weather prediction have shown promising results. A key issue for all audiences is that dense time series of these high-dimensional states comprise an enormous amount of data, precluding all but the most well resourced groups from accessing and using historical data and future projections. To address this problem, we propose a method for compressing atmospheric states using methods from the neural network literature, adapting spherical data to processing by conventional neural architectures through the use of the area-preserving HEALPix projection. We investigate two model classes for building neural compressors: the hyperprior model from the neural image compression literature and recent vector-quantised models. We show that both families of models satisfy the desiderata of small average error, a small number of high-error reconstructed pixels, faithful reproduction of extreme events such as hurricanes and heatwaves, preservation of the spectral power distribution across spatial scales. 
We demonstrate compression ratios in excess of 1000x, with compression and decompression at a rate of approximately one second per global atmospheric state.","PeriodicalId":501166,"journal":{"name":"arXiv - PHYS - Atmospheric and Oceanic Physics","volume":"77 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141722318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thomas J. Vandal, Kate Duffy, Daniel McDuff, Yoni Nachmany, Chris Hartshorn
Global data assimilation enables weather forecasting at all scales and provides valuable data for studying the Earth system. However, the computational demands of physics-based algorithms used in operational systems limit the volume and diversity of observations that are assimilated. Here, we present "EarthNet", a multi-modal foundation model for data assimilation that learns to predict a global gap-filled atmospheric state solely from satellite observations. EarthNet is trained as a masked autoencoder that ingests a 12-hour sequence of observations and learns to fill missing data from other sensors. We show that EarthNet performs a form of data assimilation, producing a global 0.16-degree reanalysis dataset of 3D atmospheric temperature and humidity in a fraction of the time required by operational systems. It is shown that the resulting reanalysis dataset reproduces climatology by evaluating a 1-hour forecast background state against observations. We also show that our 3D humidity predictions outperform MERRA-2 and ERA5 reanalyses by 10% to 60% between the middle troposphere and lower stratosphere (5 to 20 km altitude) and that our 3D temperature and humidity are statistically equivalent to the Microwave integrated Retrieval System (MiRS) observations at nearly every level of the atmosphere. Our results indicate significant promise in using EarthNet for high-frequency data assimilation and global weather forecasting.
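The masked-autoencoder training mentioned in the abstract above can be illustrated with a small sketch of the two generic ingredients: hiding a random subset of inputs and scoring a reconstruction on the hidden positions. This is not EarthNet's implementation. The function names, the scalar "patches", the mask fraction, and the choice to compute the loss only on masked positions are all assumptions made for illustration; the abstract does not state these details.

```python
import random


def mask_patches(patches, mask_fraction, seed=0):
    """Hide a random fraction of patches; hidden entries become None."""
    rng = random.Random(seed)
    n_masked = round(len(patches) * mask_fraction)
    masked_idx = set(rng.sample(range(len(patches)), n_masked))
    visible = [None if i in masked_idx else p for i, p in enumerate(patches)]
    return visible, sorted(masked_idx)


def masked_mse(prediction, target, masked_idx):
    """Mean squared error over the masked positions only (an assumption)."""
    if not masked_idx:
        raise ValueError("no masked positions to score")
    errors = [(prediction[i] - target[i]) ** 2 for i in masked_idx]
    return sum(errors) / len(errors)


# Hypothetical observations, each patch reduced to a single number.
observations = [1.0, 2.0, 3.0, 4.0]
visible, masked_idx = mask_patches(observations, mask_fraction=0.5)

# A perfect reconstruction scores zero on the hidden positions.
perfect_loss = masked_mse(observations, observations, masked_idx)
```

A real model would replace the perfect reconstruction with a network's prediction from `visible`; the sketch only shows how the masking and the scoring fit together.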
{"title":"Global atmospheric data assimilation with multi-modal masked autoencoders","authors":"Thomas J. Vandal, Kate Duffy, Daniel McDuff, Yoni Nachmany, Chris Hartshorn","doi":"arxiv-2407.11696","DOIUrl":"https://doi.org/arxiv-2407.11696","url":null,"abstract":"Global data assimilation enables weather forecasting at all scales and provides valuable data for studying the Earth system. However, the computational demands of physics-based algorithms used in operational systems limits the volume and diversity of observations that are assimilated. Here, we present \"EarthNet\", a multi-modal foundation model for data assimilation that learns to predict a global gap-filled atmospheric state solely from satellite observations. EarthNet is trained as a masked autoencoder that ingests a 12 hour sequence of observations and learns to fill missing data from other sensors. We show that EarthNet performs a form of data assimilation producing a global 0.16 degree reanalysis dataset of 3D atmospheric temperature and humidity at a fraction of the time compared to operational systems. It is shown that the resulting reanalysis dataset reproduces climatology by evaluating a 1 hour forecast background state against observations. We also show that our 3D humidity predictions outperform MERRA-2 and ERA5 reanalyses by 10% to 60% between the middle troposphere and lower stratosphere (5 to 20 km altitude) and our 3D temperature and humidity are statistically equivalent to the Microwave integrated Retrieval System (MiRS) observations at nearly every level of the atmosphere. 
Our results indicate significant promise in using EarthNet for high-frequency data assimilation and global weather forecasting.","PeriodicalId":501166,"journal":{"name":"arXiv - PHYS - Atmospheric and Oceanic Physics","volume":"307 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141720045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francesco Cavalleri, Cristian Lussana, Francesca Viterbo, Michele Brunetti, Riccardo Bonanno, Veronica Manara, Matteo Lacavalla, Simone Sperati, Mario Raffa, Valerio Capecchi, Davide Cesari, Antonio Giordani, Ines Maria Luisa Cerenzia, Maurizio Maugeri
This study focuses on the validation of high-resolution regional reanalyses to understand their effectiveness in reproducing precipitation patterns over Italy, a climate change hotspot characterized by coastal sea-land interaction and complex orography. Nine reanalysis products were evaluated, with the ECMWF global reanalysis ERA5 serving as a benchmark. These included both European (COSMO-REA6, CERRA) and Italy-specific (BOLAM, MERIDA, MERIDA-HRES, MOLOCH, SPHERA, VHR-REA_IT) datasets, using different models and parametrizations. The inter-comparison involved determining the effective resolution of daily precipitation fields using wavelet techniques and assessing intense precipitation statistics through frequency distributions. In-situ observations and observational gridded datasets were used to independently validate reanalysis precipitation fields. The capability of reanalyses to depict daily precipitation patterns was assessed, highlighting a maximum radius of precipitation misplacement of about 15 km, with notably lower skill during summer. An overall overestimation of precipitation was identified in the reanalysis climatological fields over the Po Valley and the Alps, whereas multiple products showed an underestimation of precipitation across the North-West coast, the Apennines, and Southern Italy. Finally, a comparison with a time-consistent observational dataset (UniMi/ISAC-CNR) revealed that the deviation of the analyzed reanalysis products from observations in annual precipitation totals is not stable over time. This should be taken into account when interpreting precipitation trends over Italy.
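Two of the quantities discussed in the abstract above, the frequency of intense daily precipitation and the over- or underestimation relative to observations, can be sketched with simple arithmetic. The example below is only an illustration under stated assumptions: the function names, the thresholds (1, 20 and 50 mm/day) and the sample values are hypothetical, and the study's actual procedure (including its wavelet-based effective-resolution analysis) is not reproduced here.

```python
def exceedance_frequency(daily_mm, thresholds_mm):
    """Fraction of days with precipitation at or above each threshold."""
    n_days = len(daily_mm)
    return {
        t: sum(1 for p in daily_mm if p >= t) / n_days
        for t in thresholds_mm
    }


def relative_bias(reanalysis_mm, observed_mm):
    """(reanalysis total - observed total) / observed total.

    Positive values indicate overestimation by the reanalysis.
    """
    observed_total = sum(observed_mm)
    return (sum(reanalysis_mm) - observed_total) / observed_total


# Hypothetical daily precipitation (mm/day) at one location.
observed = [0.0, 5.0, 25.0, 60.0]
reanalysis = [1.0, 8.0, 30.0, 60.0]

obs_freq = exceedance_frequency(observed, thresholds_mm=[1, 20, 50])
rea_freq = exceedance_frequency(reanalysis, thresholds_mm=[1, 20, 50])
bias = relative_bias(reanalysis, observed)
```

Comparing `obs_freq` with `rea_freq` threshold by threshold shows where a product has too many or too few intense days, while `bias` summarizes the difference in accumulated totals; computing `bias` year by year would expose a deviation that drifts over time.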
{"title":"Multi-scale assessment of high-resolution reanalysis precipitation fields over Italy","authors":"Francesco Cavalleri, Cristian Lussana, Francesca Viterbo, Michele Brunetti, Riccardo Bonanno, Veronica Manara, Matteo Lacavalla, Simone Sperati, Mario Raffa, Valerio Capecchi, Davide Cesari, Antonio Giordani, Ines Maria Luisa Cerenzia, Maurizio Maugeri","doi":"arxiv-2407.11517","DOIUrl":"https://doi.org/arxiv-2407.11517","url":null,"abstract":"This study focuses on the validation of high-resolution regional reanalyses to understand their effectiveness in reproducing precipitation patterns over Italy, a climate change hotspot characterized by coastal sea-land interaction and complex orography. Nine reanalysis products were evaluated, with the ECMWF global reanalysis ERA5 serving as a benchmark. These included both European (COSMO-REA6, CERRA) and Italy-specific (BOLAM, MERIDA, MERIDA-HRES, MOLOCH, SPHERA, VHR-REA_IT) datasets, using different models and parametrizations. The inter-comparison involved determining the effective resolution of daily precipitation fields using wavelet techniques and assessing intense precipitation statistics through frequency distributions. In-situ observations and observational gridded datasets were used to independently validate reanalysis precipitation fields. The capability of reanalyses to depict daily precipitation patterns was assessed, highlighting a maximum radius of precipitation misplacement of about 15 km, with notably lower skills during summer. An overall overestimation of precipitation was identified in the reanalysis climatological fields over the Po Valley and the Alps, whereas multiple products showed an underestimation of precipitations across the North-West coast, the Apennines, and Southern Italy. 
Finally, a comparison with a time-consistent observational dataset (UniMi/ISAC-CNR) revealed a non-stable deviation from observations in the annual precipitation cumulate of the reanalysis products analyzed. This should be taken into account when interpreting precipitation trends over Italy.","PeriodicalId":501166,"journal":{"name":"arXiv - PHYS - Atmospheric and Oceanic Physics","volume":"333 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141722316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}