使用 K-均值聚类和通过密集-稀疏-密集策略增强的 LSTM 模型预测原油价格

IF 6.4 2区计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS Journal of Big Data Pub Date : 2024-08-17 DOI:10.1186/s40537-024-00977-8

Alireza Jahandoost, Farhad Abedinzadeh Torghabeh, Seyyed Abed Hosseini, Mahboobeh Houshmand

{"title":"使用 K-均值聚类和通过密集-稀疏-密集策略增强的 LSTM 模型预测原油价格","authors":"Alireza Jahandoost, Farhad Abedinzadeh Torghabeh, Seyyed Abed Hosseini, Mahboobeh Houshmand","doi":"10.1186/s40537-024-00977-8","DOIUrl":null,"url":null,"abstract":"<p>Crude oil is an essential energy source that affects international trade, transportation, and manufacturing, highlighting its importance to the economy. Its future price prediction affects consumer prices and the energy markets, and it shapes the development of sustainable energy. It is essential for financial planning, economic stability, and investment decisions. However, reaching a reliable future prediction is an open issue because of its high volatility. Furthermore, many state-of-the-art methods utilize signal decomposition techniques, which can lead to increased prediction time. In this paper, a model called K-means-dense-sparse-dense long short-term memory (K-means-DSD-LSTM) is proposed, which has three main training phrases for crude oil price forecasting. In the first phase, the DSD-LSTM model is trained. Afterwards, the training part of the data is clustered using the K-means algorithm. Finally, a copy of the trained DSD-LSTM model is fine-tuned for each obtained cluster. It helps the models predict that cluster better while they are generalizing the whole dataset quite well, which diminishes overfitting. The proposed model is evaluated on two famous crude oil benchmarks: West Texas Intermediate (WTI) and Brent. Empirical evaluations demonstrated the superiority of the DSD-LSTM model over the K-means-LSTM model. Furthermore, the K-means-DSD-LSTM model exhibited even stronger performance. Notably, the proposed method yielded promising results across diverse datasets, achieving competitive performance in comparison to existing methods, even without employing signal decomposition techniques.</p>","PeriodicalId":15158,"journal":{"name":"Journal of Big Data","volume":"5 1","pages":""},"PeriodicalIF":6.4000,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Crude oil price forecasting using K-means clustering and LSTM model enhanced by dense-sparse-dense strategy\",\"authors\":\"Alireza Jahandoost, Farhad Abedinzadeh Torghabeh, Seyyed Abed Hosseini, Mahboobeh Houshmand\",\"doi\":\"10.1186/s40537-024-00977-8\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Crude oil is an essential energy source that affects international trade, transportation, and manufacturing, highlighting its importance to the economy. Its future price prediction affects consumer prices and the energy markets, and it shapes the development of sustainable energy. It is essential for financial planning, economic stability, and investment decisions. However, reaching a reliable future prediction is an open issue because of its high volatility. Furthermore, many state-of-the-art methods utilize signal decomposition techniques, which can lead to increased prediction time. In this paper, a model called K-means-dense-sparse-dense long short-term memory (K-means-DSD-LSTM) is proposed, which has three main training phrases for crude oil price forecasting. In the first phase, the DSD-LSTM model is trained. Afterwards, the training part of the data is clustered using the K-means algorithm. Finally, a copy of the trained DSD-LSTM model is fine-tuned for each obtained cluster. It helps the models predict that cluster better while they are generalizing the whole dataset quite well, which diminishes overfitting. The proposed model is evaluated on two famous crude oil benchmarks: West Texas Intermediate (WTI) and Brent. Empirical evaluations demonstrated the superiority of the DSD-LSTM model over the K-means-LSTM model. Furthermore, the K-means-DSD-LSTM model exhibited even stronger performance. Notably, the proposed method yielded promising results across diverse datasets, achieving competitive performance in comparison to existing methods, even without employing signal decomposition techniques.</p>\",\"PeriodicalId\":15158,\"journal\":{\"name\":\"Journal of Big Data\",\"volume\":\"5 1\",\"pages\":\"\"},\"PeriodicalIF\":6.4000,\"publicationDate\":\"2024-08-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Big Data\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1186/s40537-024-00977-8\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Big Data","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1186/s40537-024-00977-8","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}

引用次数: 0

摘要

原油是影响国际贸易、运输和制造业的重要能源，对经济的重要性不言而喻。它对未来价格的预测影响着消费价格和能源市场，并左右着可持续能源的发展。它对财务规划、经济稳定和投资决策至关重要。然而，由于其高度波动性，实现可靠的未来预测是一个尚未解决的问题。此外，许多最先进的方法都采用了信号分解技术，这会导致预测时间的增加。本文提出了一种名为 K-means-dense-sparse-dense long short-term memory（K-means-DSD-LSTM）的模型，该模型有三个主要训练阶段，用于原油价格预测。在第一阶段，对 DSD-LSTM 模型进行训练。然后，使用 K-means 算法对数据的训练部分进行聚类。最后，针对每个获得的聚类对训练好的 DSD-LSTM 模型的副本进行微调。这有助于模型在很好地泛化整个数据集的同时，更好地预测该聚类，从而减少过拟合。我们在两个著名的原油基准上对所提出的模型进行了评估：西德克萨斯中质原油（WTI）和布伦特原油。经验评估表明，DSD-LSTM 模型优于 K-means-LSTM 模型。此外，K-means-DSD-LSTM 模型表现出更强的性能。值得注意的是，所提出的方法在各种数据集上都取得了可喜的成果，与现有方法相比，即使不采用信号分解技术，也能取得具有竞争力的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Crude oil price forecasting using K-means clustering and LSTM model enhanced by dense-sparse-dense strategy

Crude oil is an essential energy source that affects international trade, transportation, and manufacturing, highlighting its importance to the economy. Its future price prediction affects consumer prices and the energy markets, and it shapes the development of sustainable energy. It is essential for financial planning, economic stability, and investment decisions. However, reaching a reliable future prediction is an open issue because of its high volatility. Furthermore, many state-of-the-art methods utilize signal decomposition techniques, which can lead to increased prediction time. In this paper, a model called K-means-dense-sparse-dense long short-term memory (K-means-DSD-LSTM) is proposed, which has three main training phrases for crude oil price forecasting. In the first phase, the DSD-LSTM model is trained. Afterwards, the training part of the data is clustered using the K-means algorithm. Finally, a copy of the trained DSD-LSTM model is fine-tuned for each obtained cluster. It helps the models predict that cluster better while they are generalizing the whole dataset quite well, which diminishes overfitting. The proposed model is evaluated on two famous crude oil benchmarks: West Texas Intermediate (WTI) and Brent. Empirical evaluations demonstrated the superiority of the DSD-LSTM model over the K-means-LSTM model. Furthermore, the K-means-DSD-LSTM model exhibited even stronger performance. Notably, the proposed method yielded promising results across diverse datasets, achieving competitive performance in comparison to existing methods, even without employing signal decomposition techniques.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Big Data Computer Science-Information Systems

CiteScore

17.80

自引率

3.70%

发文量

105

审稿时长

13 weeks

期刊介绍： The Journal of Big Data publishes high-quality, scholarly research papers, methodologies, and case studies covering a broad spectrum of topics, from big data analytics to data-intensive computing and all applications of big data research. It addresses challenges facing big data today and in the future, including data capture and storage, search, sharing, analytics, technologies, visualization, architectures, data mining, machine learning, cloud computing, distributed systems, and scalable storage. The journal serves as a seminal source of innovative material for academic researchers and practitioners alike.