Thomas J. Vandal, Kate Duffy, Daniel McDuff, Yoni Nachmany, Chris Hartshorn
Global atmospheric data assimilation with multi-modal masked autoencoders
arXiv - PHYS - Atmospheric and Oceanic Physics, published 2024-07-16
https://doi.org/arxiv-2407.11696
Abstract
Global data assimilation enables weather forecasting at all scales and
provides valuable data for studying the Earth system. However, the
computational demands of physics-based algorithms used in operational systems
limit the volume and diversity of observations that are assimilated. Here, we
present "EarthNet", a multi-modal foundation model for data assimilation that
learns to predict a global gap-filled atmospheric state solely from satellite
observations. EarthNet is trained as a masked autoencoder that ingests a
12-hour sequence of observations and learns to fill missing data from other
sensors. We show that EarthNet performs a form of data assimilation, producing a
global 0.16-degree reanalysis dataset of 3D atmospheric temperature and
humidity in a fraction of the time required by operational systems. By evaluating
a 1-hour forecast background state against observations, we show that the
resulting reanalysis dataset reproduces climatology. We also show that our 3D
humidity predictions outperform MERRA-2 and ERA5 reanalyses by 10% to 60%
between the middle troposphere and lower stratosphere (5 to 20 km altitude), and that
our 3D temperature and humidity are statistically equivalent to the Microwave
integrated Retrieval System (MiRS) observations at nearly every level of the
atmosphere. Our results indicate significant promise in using EarthNet for
high-frequency data assimilation and global weather forecasting.
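
The masked-autoencoder training idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not EarthNet's implementation: the function names, the flattened toy observation vector, the 0.5 mask ratio, and the zero-valued placeholder are all assumptions made for illustration. The sketch hides a random subset of the available observations, feeds the rest to a stand-in "model", and scores the prediction only at the hidden positions, so that true gaps (where no sensor measured anything) never contribute to the loss.

```python
import random

def hide_observed(values, mask_ratio, rng):
    """Pick a random subset of the observed (non-None) positions to hide."""
    observed_idx = [i for i, v in enumerate(values) if v is not None]
    n_hide = round(mask_ratio * len(observed_idx))
    return set(rng.sample(observed_idx, n_hide))

def masked_mse(pred, values, hidden_idx):
    """Mean squared error scored only at positions hidden from the model."""
    if not hidden_idx:
        return 0.0
    return sum((pred[i] - values[i]) ** 2 for i in hidden_idx) / len(hidden_idx)

rng = random.Random(0)

# Toy flattened observation vector (hypothetical data).
# None marks a gap where no sensor produced a measurement.
values = [rng.gauss(0.0, 1.0) if rng.random() > 0.4 else None for _ in range(200)]

# Hide half of the observed entries (assumed mask ratio).
hidden_idx = hide_observed(values, mask_ratio=0.5, rng=rng)

# Model input: true gaps and hidden entries are both replaced by a placeholder.
model_input = [
    0.0 if (v is None or i in hidden_idx) else v
    for i, v in enumerate(values)
]

# Stand-in for a trained network: predicts 0.0 everywhere.
pred = [0.0] * len(values)

loss = masked_mse(pred, values, hidden_idx)
n_observed = sum(v is not None for v in values)
print(f"hidden {len(hidden_idx)} of {n_observed} observed entries; loss = {loss:.3f}")
```

In a real multi-sensor setting the hidden entries could be whole sensors or time steps rather than individual values, which is one way a model might learn to fill one instrument's gaps from the others.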