Expanding the prediction capacity in long sequence time-series forecasting

Haoyi Zhou, Jianxin Li, Shanghang Zhang, Shuai Zhang, Mengyi Yan, Hui Xiong

Artificial Intelligence, Volume 318, Article 103886, May 2023
DOI: 10.1016/j.artint.2023.103886
URL: https://www.sciencedirect.com/science/article/pii/S0004370223000322
Cited by: 4
Abstract
Many real-world applications show a growing demand for the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) requires a higher prediction capacity of the model, that is, the ability to efficiently capture precise long-range dependency coupling between output and input. Recent studies have shown the potential of the Transformer to meet this capacity requirement. However, three real challenges may have prevented expanding the prediction capacity in LSTF: the Transformer is limited by quadratic time complexity, high memory usage, and slow inference speed under the encoder-decoder architecture. To address these issues, we design an efficient Transformer-based model for LSTF, named Informer, with three distinctive characteristics. (i) A ProbSparse self-attention mechanism, which achieves O(L log L) time complexity and memory usage, and has comparable performance on sequences' dependency alignment. (ii) Self-attention distilling, which promotes dominating attention by convolutional operators; in addition, halving the layer width reduces the expense of building a deeper network on extremely long input sequences. (iii) A generative-style decoder which, while conceptually simple, predicts long time-series sequences in one forward operation rather than step by step, drastically improving the inference speed of long-sequence predictions. Extensive experiments on ten large-scale datasets demonstrate that Informer significantly outperforms existing methods and provides a new solution to the LSTF problem.
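The sparse-attention idea in point (i) can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation: the function name, the sampling factor `c`, and the use of the full (unsampled) score matrix for the sparsity measurement are assumptions for illustration (the paper's actual procedure samples keys so that the measurement itself stays sub-quadratic). Each query receives a max-minus-mean "sparsity" score; only the top-u ≈ c·ln L queries attend over all keys, while the remaining "lazy" queries fall back to the mean of V.

```python
import numpy as np

def probsparse_attention(Q, K, V, c=2.0):
    """Toy ProbSparse-style self-attention (illustrative sketch only).

    Only the top-u queries, ranked by a max-minus-mean sparsity score,
    attend over all keys; the other outputs default to the mean of V.
    """
    L, d = Q.shape
    scores = (Q @ K.T) / np.sqrt(d)               # (L, L) scaled dot-product scores
    # Sparsity measurement: peaked score distributions matter most
    M = scores.max(axis=1) - scores.mean(axis=1)  # (L,) per-query score
    u = min(L, max(1, int(c * np.ceil(np.log(L)))))
    top = np.argsort(-M)[:u]                      # indices of the u "active" queries
    out = np.tile(V.mean(axis=0), (L, 1))         # lazy queries -> mean of V
    w = np.exp(scores[top] - scores[top].max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)             # row-wise softmax over keys
    out[top] = w @ V                              # full attention for active queries
    return out
```

Because only O(log L) queries take part in the softmax-weighted aggregation, the dominant cost scales as O(L log L) rather than O(L²) once the sparsity scores are also estimated from a sampled subset of keys, as in the paper.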
Journal overview:
The journal Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.