{"title":"LightWeather: Harnessing Absolute Positional Encoding to Efficient and Scalable Global Weather Forecasting","authors":"Yisong Fu, Fei Wang, Zezhi Shao, Chengqing Yu, Yujie Li, Zhao Chen, Zhulin An, Yongjun Xu","doi":"arxiv-2408.09695","DOIUrl":null,"url":null,"abstract":"Recently, Transformers have gained traction in weather forecasting for their\ncapability to capture long-term spatial-temporal correlations. However, their\ncomplex architectures result in large parameter counts and extended training\ntimes, limiting their practical application and scalability to global-scale\nforecasting. This paper aims to explore the key factor for accurate weather\nforecasting and design more efficient solutions. Interestingly, our empirical\nfindings reveal that absolute positional encoding is what really works in\nTransformer-based weather forecasting models, which can explicitly model the\nspatial-temporal correlations even without attention mechanisms. We\ntheoretically prove that its effectiveness stems from the integration of\ngeographical coordinates and real-world time features, which are intrinsically\nrelated to the dynamics of weather. Based on this, we propose LightWeather, a\nlightweight and effective model for station-based global weather forecasting.\nWe employ absolute positional encoding and a simple MLP in place of other\ncomponents of Transformer. With under 30k parameters and less than one hour of\ntraining time, LightWeather achieves state-of-the-art performance on global\nweather datasets compared to other advanced DL methods. The results underscore\nthe superiority of integrating spatial-temporal knowledge over complex\narchitectures, providing novel insights for DL in weather forecasting.","PeriodicalId":501166,"journal":{"name":"arXiv - PHYS - Atmospheric and Oceanic Physics","volume":"42 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - PHYS - Atmospheric and Oceanic Physics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.09695","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Recently, Transformers have gained traction in weather forecasting for their
capability to capture long-term spatial-temporal correlations. However, their
complex architectures result in large parameter counts and extended training
times, limiting their practical application and scalability to global-scale
forecasting. This paper aims to identify the key factor behind accurate weather
forecasting and to design more efficient solutions. Interestingly, our empirical
findings reveal that absolute positional encoding is what really works in
Transformer-based weather forecasting models: it can explicitly model the
spatial-temporal correlations even without attention mechanisms. We
theoretically prove that its effectiveness stems from the integration of
geographical coordinates and real-world time features, which are intrinsically
related to the dynamics of weather. Based on this, we propose LightWeather, a
lightweight and effective model for station-based global weather forecasting.
We employ absolute positional encoding and a simple MLP in place of the other
components of the Transformer. With under 30k parameters and less than one hour of
training time, LightWeather achieves state-of-the-art performance on global
weather datasets compared to other advanced DL methods. The results underscore
the superiority of integrating spatial-temporal knowledge over complex
architectures, providing novel insights for DL in weather forecasting.
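
To make the core idea concrete, below is a minimal, hypothetical sketch of the kind of model the abstract describes: an absolute positional encoding built from geographical coordinates and real-world time features, added to an embedding of the station's history, followed by a small MLP instead of attention. The class name, layer sizes, and the specific time features are illustrative assumptions, not the authors' published implementation.

```python
# Hypothetical sketch (not the official LightWeather code): absolute positional
# encoding from coordinates + time features, plus an MLP, replacing attention.
import torch
import torch.nn as nn

class CoordTimeMLP(nn.Module):
    def __init__(self, hist_len=24, pred_len=24, hidden=64):
        super().__init__()
        # Absolute positional encoding: project (lat, lon) and two real-world
        # time features (e.g., hour-of-day, day-of-year) into the hidden space.
        self.coord_enc = nn.Linear(2, hidden)   # geographical coordinates
        self.time_enc = nn.Linear(2, hidden)    # real-world time features
        self.hist_enc = nn.Linear(hist_len, hidden)  # past observations
        # Simple MLP in place of the Transformer encoder/decoder stack.
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, pred_len),
        )

    def forward(self, history, coords, time_feats):
        # history:    (batch, stations, hist_len)  past values per station
        # coords:     (batch, stations, 2)         normalized lat/lon
        # time_feats: (batch, stations, 2)         normalized time features
        h = self.hist_enc(history) + self.coord_enc(coords) + self.time_enc(time_feats)
        return self.mlp(h)                       # (batch, stations, pred_len)

model = CoordTimeMLP()
print(sum(p.numel() for p in model.parameters()))  # a few thousand parameters, well under 30k
x = torch.randn(8, 100, 24)   # 8 samples, 100 stations, 24 past steps
c = torch.rand(8, 100, 2)
t = torch.rand(8, 100, 2)
print(model(x, c, t).shape)   # torch.Size([8, 100, 24])
```

Even this toy configuration stays far below the 30k-parameter budget cited in the abstract, which is the point of the design: the spatial-temporal knowledge enters through the coordinate and time encodings rather than through a heavy attention architecture.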