Global atmospheric data assimilation with multi-modal masked autoencoders

Thomas J. Vandal, Kate Duffy, Daniel McDuff, Yoni Nachmany, Chris Hartshorn
arXiv - PHYS - Atmospheric and Oceanic Physics, published 2024-07-16
DOI: arxiv-2407.11696 (https://doi.org/arxiv-2407.11696)

Abstract

Global data assimilation enables weather forecasting at all scales and provides valuable data for studying the Earth system. However, the computational demands of the physics-based algorithms used in operational systems limit the volume and diversity of observations that can be assimilated. Here, we present "EarthNet", a multi-modal foundation model for data assimilation that learns to predict a global, gap-filled atmospheric state solely from satellite observations. EarthNet is trained as a masked autoencoder that ingests a 12-hour sequence of observations and learns to fill in missing data from other sensors. We show that EarthNet performs a form of data assimilation, producing a global 0.16-degree reanalysis dataset of 3D atmospheric temperature and humidity in a fraction of the time required by operational systems. By evaluating a 1-hour forecast background state against observations, we show that the resulting reanalysis dataset reproduces climatology. We also show that our 3D humidity predictions outperform the MERRA-2 and ERA5 reanalyses by 10% to 60% between the middle troposphere and lower stratosphere (5 to 20 km altitude), and that our 3D temperature and humidity are statistically equivalent to the Microwave integrated Retrieval System (MiRS) observations at nearly every level of the atmosphere. Our results indicate significant promise in using EarthNet for high-frequency data assimilation and global weather forecasting.
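The abstract does not describe EarthNet's architecture or loss function, so the following is only a minimal sketch of a generic masked-reconstruction objective of the kind a masked autoencoder is typically trained with. It assumes gridded observations with a boolean coverage map, and the function names `apply_mask` and `masked_reconstruction_loss` are hypothetical, not taken from the paper.

```python
import numpy as np


def apply_mask(x, observed, hidden, fill=0.0):
    """Build a model input: keep visible observations, replace the rest with `fill`.

    x:        float array of observations, e.g. shape (time, channel, lat, lon)
    observed: bool array, True where a sensor actually reported a value
    hidden:   bool array, True where the value is withheld from the model
    """
    visible = observed & ~hidden
    return np.where(visible, x, fill), visible


def masked_reconstruction_loss(pred, target, observed, hidden):
    """Mean squared error over cells that were observed but hidden from the model.

    Cells with no observation are never scored, because no reference value exists.
    """
    score = observed & hidden
    if not score.any():
        return 0.0
    diff = pred[score] - target[score]
    return float(np.mean(diff ** 2))


# Toy 2x2 grid (illustrative values only, not real satellite data).
target = np.array([[1.0, 2.0], [3.0, 4.0]])
observed = np.array([[True, True], [False, True]])
hidden = np.array([[True, False], [True, True]])

model_input, visible = apply_mask(target, observed, hidden)
pred = np.zeros_like(target)  # stand-in for a model's output
loss = masked_reconstruction_loss(pred, target, observed, hidden)
```

In this toy case only the two cells that were both observed and hidden contribute to the loss. The unobserved cell is ignored, which is how such a model can be trained on gappy satellite swaths and still produce a gap-filled field at inference time.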