Viacheslav Kovtun, Krzysztof Grochla, Mohammed Al-Maitah, Saad Aldosary, Tetiana Gryshchuk
Cyber epidemic spread forecasting based on the entropy-extremal dynamic interpretation of the SIR model
Egyptian Informatics Journal, vol. 28, Article 100572 (published 2024-11-11). DOI: 10.1016/j.eij.2024.100572
https://www.sciencedirect.com/science/article/pii/S111086652400135X
Abstract
In its early stage, the spread of a cyber epidemic is an uncertain process for which only a small amount of statistically unreliable data is available. Nonlinear dynamic models, most commonly the SIR model, are widely used to describe such processes. The description this model produces is sensitive to the chosen initial conditions and to how well the controlled parameters are tuned from operational observations, which are inherently uncertain. This article proposes a stochastic interpretation of the controlled parameters of the SIR model and introduces additional stochastic parameters that represent the variability of operational data measurements. Estimating the probability density functions of these parameters and noises is formulated as a strict optimization problem. The resulting mathematical apparatus is generalized into two versions of the entropy-extremal adaptation of the SIR model, both applied to forecasting the spread of a cyber epidemic. The first version focuses on estimating the SIR model parameters from operational data; the second focuses on stochastic modelling of the transmission rate and its impact on the forecast. The forecast is the average of the set of trajectories generated by the authors' models, which characterize the dynamics of compartment I (infected). The experimental part of the article compares the classical Least Squares method with the authors' entropy-extremal approach for estimating the SIR model parameters, using reference (etalon) data on the spread of the most threatening categories of malware cyber epidemics. The empirical results show a significant reduction in the Mean Absolute Percentage Error relative to the reference data over the prediction interval, which supports the adequacy of the proposed approach.
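The forecasting idea in the abstract (sample an uncertain transmission rate, simulate compartment I for each sample, average the trajectories, score the result with MAPE) can be illustrated with a minimal sketch. This is not the authors' entropy-extremal procedure: the uniform sampling of the transmission rate stands in for their estimated probability density, and the function names, parameter values, forward Euler integration, and synthetic reference data are all assumptions made for illustration.

```python
import random


def simulate_sir(beta, gamma, s0, i0, r0, steps, dt=1.0):
    """Integrate the classical SIR equations with forward Euler.

    Returns the trajectory of compartment I (length steps + 1).
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    infected = [i]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        infected.append(i)
    return infected


def mean_trajectory(beta_low, beta_high, gamma, s0, i0, r0, steps,
                    n_samples=500, seed=0):
    """Average the I trajectories over randomly sampled transmission rates.

    Assumption: beta is drawn uniformly from [beta_low, beta_high]; the
    paper instead estimates an entropy-optimal density, not reproduced here.
    """
    rng = random.Random(seed)
    total = [0.0] * (steps + 1)
    for _ in range(n_samples):
        beta = rng.uniform(beta_low, beta_high)
        trajectory = simulate_sir(beta, gamma, s0, i0, r0, steps)
        for k, value in enumerate(trajectory):
            total[k] += value
    return [value / n_samples for value in total]


def mape(reference, forecast):
    """Mean Absolute Percentage Error, skipping zero reference values."""
    pairs = [(a, f) for a, f in zip(reference, forecast) if a != 0]
    if not pairs:
        raise ValueError("no non-zero reference values to compare against")
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


if __name__ == "__main__":
    # Synthetic reference data (hypothetical): a single SIR run with beta = 0.30.
    reference = simulate_sir(0.30, 0.10, 990, 10, 0, 30)
    forecast = mean_trajectory(0.25, 0.35, 0.10, 990, 10, 0, 30)
    print(f"MAPE over the prediction interval: {mape(reference, forecast):.2f}%")
```

Averaging over sampled parameters rather than committing to a single fitted value is what separates this style of forecast from a plain least-squares fit; the quality of the result depends on how well the sampling density reflects the actual parameter uncertainty.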
About the journal:
The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The Journal provides a forum for state-of-the-art research and development in the fields of computing, including computer sciences, information technologies, information systems, operations research and decision support. It encourages the submission of innovative, previously unpublished work in the subjects it covers, whether from academic, research or commercial sources.