Forecasting: Latest Articles, Page 2

Riding into Danger: Predictive Modeling for ATV-Related Injuries and Seasonal Patterns

Forecasting

Pub Date : 2024-04-02 DOI: 10.3390/forecast6020015

Fernando Ferreira Lima dos Santos, Farzaneh Khorsandi

All-Terrain Vehicles (ATVs) are popular off-road vehicles in the United States, with a staggering 10.5 million households reported to own at least one ATV. Despite their popularity, ATVs pose a significant risk of severe injuries, leading to substantial healthcare expenses and raising public health concerns. As such, gaining insights into the patterns of ATV-related hospitalizations and accurately predicting these injuries is of paramount importance. This knowledge can guide the development of effective prevention strategies, ultimately mitigating ATV-related injuries and the associated healthcare costs. Therefore, we performed an in-depth analysis of ATV-related hospitalizations from 2010 to 2021. Furthermore, we developed and assessed the performance of three forecasting models—Neural Prophet, SARIMA, and LSTM—to predict ATV-related injuries. The performance of these models was evaluated using the Root Mean Square Error (RMSE) accuracy metric. As a result, the LSTM model outperformed the others and could be used to provide valuable insights that can aid in strategic planning and resource allocation within healthcare systems. In addition, our findings highlight the urgent need for prevention programs that are specifically targeted toward youth and timed for the summer season.
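The RMSE comparison described in the abstract can be illustrated with a minimal sketch. The `seasonal_naive` baseline, the season length, and the monthly counts below are hypothetical and chosen only for illustration; they are not the paper's models or data.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def seasonal_naive(history, season_length, horizon):
    """Hypothetical baseline: repeat the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


# Illustrative counts only (assumption, not data from the study).
train = [12, 15, 30, 42, 38, 20, 14, 16, 33, 45, 40, 22]
test = [13, 17, 31, 44]

forecast = seasonal_naive(train, season_length=6, horizon=len(test))
score = rmse(test, forecast)
print(forecast, round(score, 4))
```

A lower RMSE on the held-out period indicates the better forecaster; any of the three models named in the abstract could be scored this way by substituting its predictions for `forecast`.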

Citations: 0

Predictive Maintenance Framework for Fault Detection in Remote Terminal Units

Forecasting

Pub Date : 2024-03-25 DOI: 10.3390/forecast6020014

A. Lekidis, Angelos Georgakis, Christos Dalamagkas, Elpiniki I. Papageorgiou

The scheduled maintenance of industrial equipment is usually performed at a low frequency, as it typically leads to unpredicted downtime in business operations. Nevertheless, this confers a risk of failure in individual modules of the equipment, which may diminish its performance or even lead to its breakdown, rendering it non-operational. Lately, predictive maintenance methods have been considered for industrial systems, such as power generation stations, as a proactive measure for preventing failures. Such methods use data gathered from industrial equipment and Machine Learning (ML) algorithms to identify data patterns that indicate anomalies and may lead to potential failures. However, industrial equipment exhibits specific behavior and interactions that originate from its manufacturer configuration and from the system in which it is installed, which constitutes a great challenge for the effectiveness of ML-based maintenance and failure predictions. In this article, we propose a novel method for tackling this challenge based on the development of a digital twin for industrial equipment known as a Remote Terminal Unit (RTU). RTUs are used in electrical systems to provide the remote monitoring and control of critical equipment, such as power generators. The method is applied in an RTU that is connected to a real power generator within a Public Power Corporation (PPC) facility, where operational anomalies are forecasted based on measurements of its processing power, operating temperature, voltage, and storage memory.
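The abstract does not state which anomaly detector is used, so the following is only a stand-in: a rolling z-score rule applied to one measurement stream. The function name, window size, threshold, and temperature readings are all assumptions made for illustration and are not taken from the paper.

```python
import statistics


def rolling_zscore_flags(values, window=5, threshold=3.0):
    """Flag a reading when it deviates from the mean of the previous
    `window` readings by more than `threshold` standard deviations.
    This is a generic rule, not the digital-twin method of the paper."""
    flags = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            flags.append(False)  # not enough history yet
            continue
        mean = statistics.fmean(history)
        std = statistics.pstdev(history)
        flags.append(std > 0 and abs(x - mean) / std > threshold)
    return flags


# Hypothetical operating temperatures in degrees Celsius (illustrative only).
temperatures = [40.0, 40.5, 39.8, 40.2, 40.1, 40.3, 55.0, 40.0]
flags = rolling_zscore_flags(temperatures)
print(flags)
```

The same rule could be run separately on each of the four measurements listed in the abstract; a learned model would replace the fixed threshold in a real predictive maintenance setting.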

Citations: 0

Effective Natural Language Processing Algorithms for Early Alerts of Gout Flares from Chief Complaints

Forecasting

Pub Date : 2024-03-10 DOI: 10.3390/forecast6010013

Lucas Lopes Oliveira, Xiaorui Jiang, Aryalakshmi Nellippillipathil Babu, Poonam Karajagi, Alireza Daneshkhah

Early identification of acute gout is crucial, enabling healthcare professionals to implement targeted interventions for rapid pain relief and preventing disease progression, ensuring improved long-term joint function. In this study, we comprehensively explored the potential early detection of gout flares (GFs) based on nurses’ chief complaint notes in the Emergency Department (ED). Addressing the challenge of identifying GFs prospectively during an ED visit, where documentation is typically minimal, our research focused on employing alternative Natural Language Processing (NLP) techniques to enhance detection accuracy. We investigated GF detection algorithms using both sparse representations by traditional NLP methods and dense encodings by medical domain-specific Large Language Models (LLMs), distinguishing between generative and discriminative models. Three methods were used to alleviate the issue of severe data imbalances, including oversampling, class weights, and focal loss. Extensive empirical studies were performed on the Gout Emergency Department Chief Complaint Corpora. Sparse text representations like tf-idf proved to produce strong performances, achieving F1 scores higher than 0.75. The best deep learning models were RoBERTa-large-PM-M3-Voc and BioGPT, which had the best F1 scores for each dataset, with a 0.8 on the 2019 dataset and a 0.85 F1 score on the 2020 dataset, respectively. We concluded that although discriminative LLMs performed better for this classification task when compared to generative LLMs, a combination of using generative models as feature extractors and employing a support vector machine for classification yielded promising results comparable to those obtained with discriminative models.
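Two of the three imbalance remedies named above, class weights and focal loss, can be sketched in a few lines. This is a minimal sketch under stated assumptions: the alpha-balanced binary form of focal loss with `gamma=2.0` and `alpha=0.25` is a common default rather than the paper's reported setting, and the "balanced" weighting heuristic `n_samples / (n_classes * count)` is assumed, not confirmed by the abstract.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example.

    p: predicted probability of the positive class
    y: true label (0 or 1)
    gamma, alpha: assumed defaults, not values from the paper
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


# Hypothetical label distribution: 9 negatives, 1 gout-flare positive.
weights = balanced_class_weights([0] * 9 + [1])
easy = focal_loss(0.9, 1)   # confident, correct prediction
hard = focal_loss(0.6, 1)   # less confident prediction
print(weights, easy < hard)
```

The factor `(1 - p_t) ** gamma` shrinks the loss of well-classified examples, so training concentrates on the hard, typically minority-class, cases; setting `gamma=0` recovers weighted cross-entropy.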

Citations: 0

Applying Machine Learning and Statistical Forecasting Methods for Enhancing Pharmaceutical Sales Predictions

Forecasting

Pub Date : 2024-02-16 DOI: 10.3390/forecast6010010

K. P. Fourkiotis, Athanasios Tsadiras

In today’s evolving world, the pharmaceutical sector faces an emerging challenge: the rapid growth of the global population and the consequent growth in drug production demands. Recognizing this, our study explores the urgent need to strengthen pharmaceutical production capacities, ensuring drugs are allocated and stored strategically to meet diverse regional and demographic needs. Our research focuses on the promising area of drug demand forecasting using artificial intelligence (AI) and machine learning (ML) techniques to enhance predictions in the pharmaceutical field. Using a dataset from Kaggle spanning 600,000 sales records from a single pharmacy, our study undertakes a thorough exploration of univariate time series analysis. Here, we pair conventional analytical tools such as ARIMA with advanced methodologies like LSTM neural networks, all with a single aim: refining the precision of our sales predictions. The data were then categorised and segmented into eight clusters based on the Anatomical Therapeutic Chemical (ATC) Classification System. This segmentation reveals the evident influence of seasonality on drug sales. The analysis highlights both the effectiveness of machine learning models and the strong performance of XGBoost. This algorithm outperformed traditional models, achieving the lowest MAPE values: 17.89% for M01AB (anti-inflammatory and antirheumatic products, non-steroids, acetic acid derivatives, and related substances), 16.92% for M01AE (anti-inflammatory and antirheumatic products, non-steroids, and propionic acid derivatives), 17.98% for N02BA (analgesics, antipyretics, and anilides), and 16.05% for N02BE (analgesics, antipyretics, pyrazolones, and anilides). XGBoost also achieved the lowest MSE scores: 28.8 for M01AB, 1518.56 for N02BE, and 350.84 for N05C (hypnotics and sedatives). Additionally, the Seasonal Naïve model recorded an MSE of 49.19 for M01AE, while the Single Exponential Smoothing model showed an MSE of 7.19 for N05B. These findings underscore the strengths of employing a diverse range of forecasting approaches. In summary, our research accentuates the significance of leveraging machine learning techniques to derive valuable insights for pharmaceutical companies. By applying these methods, companies can optimize their production, storage, distribution, and marketing practices.
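The two error measures reported above, MAPE and MSE, and the Single Exponential Smoothing forecaster can be written out as a minimal sketch. The function names, the smoothing constant, and the sales figures are hypothetical and serve only to show the arithmetic; they are not values from the study.

```python
def mse(actual, predicted):
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.
    Undefined when any actual value is zero."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def single_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the series.
    The level starts at the first observation (a common convention)."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1.0 - alpha) * level
    return level


# Illustrative weekly unit sales (assumption, not data from the paper).
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mse(actual, predicted), mape(actual, predicted))
print(single_exponential_smoothing([10.0, 20.0, 30.0], alpha=0.5))
```

MSE depends on the scale of each series, which is why the reported MSE values differ by orders of magnitude across drug categories, whereas MAPE is scale-free and comparable across them.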

Citations: 0

State-Dependent Model Based on Singular Spectrum Analysis Vector for Modeling Structural Breaks: Forecasting Indonesian Export

Forecasting

Pub Date : 2024-02-12 DOI: 10.3390/forecast6010009

Yoga Sasmita, Heri Kuswanto, D. Prastyo

Standard time-series modeling requires the stability of model parameters over time. The instability of model parameters is often caused by structural breaks, leading to the formation of nonlinear models. A state-dependent model (SDM) is a more general and flexible scheme in nonlinear modeling. On the other hand, time-series data often exhibit multiple frequency components, such as trends, seasonality, cycles, and noise. These frequency components can be optimized in forecasting using Singular Spectrum Analysis (SSA). Furthermore, the two most widely used forecasting approaches in SSA are the recurrent approach based on the linear recurrent formula (SSAR) and the vector approach (SSAV). SSAV has better accuracy and robustness than SSAR, especially in handling structural breaks. Therefore, this research proposes modeling the SSAV coefficients with an SDM approach to account for structural breaks; the resulting method is called SDM-SSAV. SDM recursively updates the SSAV coefficients to adapt over time and between states using an Extended Kalman Filter (EKF). Empirical results with Indonesian Export data and simulation studies show that the accuracy of SDM-SSAV outperforms SSAR, SSAV, SDM-SSAR, hybrid ARIMA-LSTM, and VARI.
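For readers unfamiliar with SSA, the standard building blocks behind the abstract can be written as follows. The notation is an assumption introduced here (series $x_1,\dots,x_N$, window length $L$, eigentriples $(\sqrt{\lambda_i}, U_i, V_i)$, reconstructed series $\tilde{x}$, coefficients $a_j$), not the paper's own; it shows the textbook embedding, decomposition, and linear recurrent forecast rather than the SDM-SSAV procedure itself.

```latex
\begin{align}
\mathbf{X} &= [X_1 : \cdots : X_K], &
X_i &= (x_i, x_{i+1}, \ldots, x_{i+L-1})^{\top}, \quad K = N - L + 1, \\
\mathbf{X} &= \sum_{i=1}^{d} \sqrt{\lambda_i}\, U_i V_i^{\top}, &
d &= \operatorname{rank}(\mathbf{X}), \\
\hat{x}_{N+1} &= \sum_{j=1}^{L-1} a_j\, \tilde{x}_{N+1-j}.
\end{align}
```

In the fixed-coefficient case the $a_j$ are computed once from the leading eigenvectors. In the state-dependent setting described in the abstract, the coefficients are instead updated recursively over time with an Extended Kalman Filter.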

Citations: 0

Bootstrapping Long-Run Covariance of Stationary Functional Time Series

Forecasting

Pub Date : 2024-02-05 DOI: 10.3390/forecast6010008

Han Lin Shang

A key summary statistic in a stationary functional time series is the long-run covariance function that measures serial dependence. It can be consistently estimated via a kernel sandwich estimator, which is the core of dynamic functional principal component regression for forecasting functional time series. To measure the uncertainty of the long-run covariance estimation, we consider sieve and functional autoregressive (FAR) bootstrap methods to generate pseudo-functional time series and study variability associated with the long-run covariance. The sieve bootstrap method is nonparametric (i.e., model-free), while the FAR bootstrap method is semi-parametric. The sieve bootstrap method relies on functional principal component analysis to decompose a functional time series into a set of estimated functional principal components and their associated scores. The scores can be bootstrapped via a vector autoregressive representation. The bootstrapped functional time series are obtained by multiplying the bootstrapped scores by the estimated functional principal components. The FAR bootstrap method relies on the FAR of order 1 to model the conditional mean of a functional time series, while residual functions can be bootstrapped via independent and identically distributed resampling. Through a series of Monte Carlo simulations, we evaluate and compare the finite-sample accuracy between the sieve and FAR bootstrap methods for quantifying the estimation uncertainty of the long-run covariance of a stationary functional time series.
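The quantities named in the abstract can be stated compactly. The notation below is an assumption introduced for illustration (curves $X_1,\dots,X_n$ on a common domain, kernel $W$, bandwidth $h$, $K$ retained components with scores $\beta_{t,k}$ and eigenfunctions $\phi_k$) and follows the usual definitions rather than quoting the paper.

```latex
\begin{align}
C(s,t) &= \sum_{\ell=-\infty}^{\infty} \gamma_{\ell}(s,t), &
\gamma_{\ell}(s,t) &= \operatorname{Cov}\bigl(X_0(s),\, X_{\ell}(t)\bigr), \\
\widehat{C}_{h}(s,t) &= \sum_{\ell=-(n-1)}^{n-1} W\!\left(\frac{\ell}{h}\right) \widehat{\gamma}_{\ell}(s,t), &
\widehat{\gamma}_{\ell}(s,t) &= \frac{1}{n} \sum_{j=1}^{n-\ell}
  \bigl(X_j(s) - \bar{X}(s)\bigr)\bigl(X_{j+\ell}(t) - \bar{X}(t)\bigr), \quad \ell \ge 0, \\
X_t(u) &\approx \bar{X}(u) + \sum_{k=1}^{K} \widehat{\beta}_{t,k}\, \widehat{\phi}_k(u), &
X_t^{*}(u) &= \bar{X}(u) + \sum_{k=1}^{K} \beta_{t,k}^{*}\, \widehat{\phi}_k(u).
\end{align}
```

For negative lags, $\widehat{\gamma}_{\ell}(s,t) = \widehat{\gamma}_{-\ell}(t,s)$. The last line sketches the sieve bootstrap step: the scores $\beta_{t,k}^{*}$ are resampled through a fitted vector autoregression and multiplied by the estimated principal components. Recomputing $\widehat{C}_{h}$ on each pseudo-series $X_t^{*}$ gives the bootstrap distribution used to assess estimation uncertainty.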

Citations: 0

Forecasting the Occurrence of Electricity Price Spikes: A Statistical-Economic Investigation Study

Forecasting

Pub Date : 2024-02-01 DOI: 10.3390/forecast6010007

Manuel Zamudio López, H. Zareipour, Mike Quashie

This research proposes an investigative experiment employing binary classification for short-term electricity price spike forecasting. Numerical definitions for price spikes are derived from economic and statistical thresholds. The predictive task employs two tree-based machine learning classifiers and a deterministic point forecaster (a statistical regression model). Hyperparameters for the tree-based classifiers are optimized for statistical performance based on recall, precision, and F1-score. The deterministic forecaster is adapted from the literature on electricity price forecasting for the classification task. Additionally, one tree-based model prioritizes interpretability, generating decision rules that are subsequently utilized to produce price spike forecasts. For all models, we evaluate the final statistical and economic predictive performance. The interpretable model is analyzed for the trade-off between performance and interpretability. Numerical results highlight the significance of complementing statistical performance with economic assessment in electricity price spike forecasting. All experiments utilize data from Alberta’s electricity market.
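To make the evaluation setup concrete, the sketch below labels spikes with a simple statistical threshold and scores a binary forecast with precision, recall, and F1. The threshold rule (mean plus `k` standard deviations), the function names, and the toy numbers are assumptions for illustration; the abstract does not state the thresholds the authors used.

```python
import numpy as np

def label_spikes(prices, k=2.0):
    """Flag prices above mean + k * std (an assumed statistical threshold)."""
    prices = np.asarray(prices, dtype=float)
    threshold = prices.mean() + k * prices.std()
    return prices > threshold

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for a binary spike / no-spike forecast."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)       # spikes correctly forecast
    fp = np.sum(~y_true & y_pred)      # false alarms
    fn = np.sum(y_true & ~y_pred)      # missed spikes
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return float(precision), float(recall), float(f1)

# Made-up hourly prices with one obvious spike at index 7.
prices = np.array([30, 32, 31, 29, 33, 30, 31, 400.0, 32, 30])
labels = label_spikes(prices)

# Made-up forecast: two hits, one false alarm, one miss.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Because spikes are rare, accuracy alone says little here: a forecaster that never predicts a spike scores high accuracy but zero recall, which is one reason precision, recall, and F1 are the natural statistical metrics for this task.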

Citations: 0

Data-Driven Models to Forecast the Impact of Temperature Anomalies on Rice Production in Southeast Asia

Forecasting

Pub Date : 2024-01-31 DOI: 10.3390/forecast6010006

Sabrina De Nardi, C. Carnevale, Sara Raccagni, L. Sangiorgi

Models are a core element in performing local estimation of the climate change input. In this work, a novel approach to perform a fast downscaling of global temperature anomalies on a regional level is presented. The approach is based on a set of data-driven models linking global temperature anomalies and regional and global emissions to regional temperature anomalies. In particular, due to the limited number of available data, a linear autoregressive structure with exogenous input (ARX) has been considered. To demonstrate their relevance to the existing literature and context, the proposed ARX models have been employed to evaluate the impact of temperature anomalies on rice production in a socially, economically, and climatologically fragile area like Southeast Asia. The results show a significant impact on this region, with estimations strongly in accordance with information presented in the literature from different sources and scientific fields. The work represents a first step towards the development of a fast, data-driven, holistic approach to the climate change impact evaluation problem. The proposed ARX data-driven models reveal a novel and feasible way to downscale global temperature anomalies to regional levels, showing their importance in comprehending global temperature anomalies, emissions, and regional climatic conditions.
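A linear ARX structure like the one mentioned above can be fitted by ordinary least squares. The sketch below is a generic illustration, not the authors' model: the function `fit_arx`, the model order, the synthetic data, and the reading of the two input columns as a global anomaly and an emissions series are all assumptions.

```python
import numpy as np

def fit_arx(y, U, na=1):
    """Least-squares fit of y[t] = sum_i a[i] * y[t-1-i] + U[t] @ b + e[t].

    y: (n,) output series; U: (n, m) exogenous inputs; na: autoregressive order.
    Returns (a, b), with a ordered from lag 1 to lag na.
    """
    y = np.asarray(y, dtype=float)
    U = np.asarray(U, dtype=float)
    # Each regressor row holds [y[t-1], ..., y[t-na], U[t, 0], ..., U[t, m-1]].
    rows = [np.concatenate([y[t - na:t][::-1], U[t]])
            for t in range(na, len(y))]
    Phi = np.array(rows)
    theta, *_ = np.linalg.lstsq(Phi, y[na:], rcond=None)
    return theta[:na], theta[na:]

# Synthetic example with known coefficients (illustrative values only).
rng = np.random.default_rng(1)
n = 60
U = rng.normal(size=(n, 2))    # hypothetical: global anomaly, emissions
y = np.zeros(n)                # hypothetical: regional anomaly
for t in range(1, n):
    y[t] = (0.6 * y[t - 1] + 0.8 * U[t, 0] + 0.1 * U[t, 1]
            + 0.01 * rng.normal())

a, b = fit_arx(y, U, na=1)
```

With only a few parameters to estimate, a structure like this remains identifiable from short records, which matches the abstract's motivation of having limited data available.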

Citations: 0