
Annals of Data Science: Latest Articles

On Poisson Moment Exponential Distribution with Associated Regression and INAR(1) Process
Q1 Decision Sciences Pub Date : 2023-06-08 DOI: 10.1007/s40745-023-00476-2
R. Maya, Jie Huang, M. R. Irshad, Fukang Zhu

Numerous studies have emphasised the significance of count data modeling and its applications to phenomena that occur in the real world. From this perspective, this article examines the traits and applications of the Poisson-moment exponential (PME) distribution in the contexts of time series analysis and regression analysis for real-world phenomena. The PME distribution is a novel one-parameter discrete distribution that can be used as a powerful alternative for the existing distributions for modeling over-dispersed count datasets. The advantages of the PME distribution, including the simplicity of the probability mass function and the explicit expressions of the functions of all the statistical properties, drove us to develop the inferential aspects and learn more about its practical applications. The unknown parameter is estimated using both maximum likelihood and moment estimation methods. Also, we present a parametric regression model based on the PME distribution for the count datasets. To strengthen the utility of the suggested distribution, we propose a new first-order integer-valued autoregressive (INAR(1)) process with PME innovations based on binomial thinning for modeling integer-valued time series with over-dispersion. Application to four real datasets confirms the empirical significance of the proposed model.
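The INAR(1) recursion with binomial thinning mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the innovation sampler assumes that the Poisson rate is gamma distributed with shape 2 and scale `beta`, which is only one possible reading of "moment exponential"; the paper's parameterisation may differ.

```python
import math
import random


def binomial_thinning(alpha, x, rng):
    """alpha o x: each of the x units survives independently with probability alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (fine for moderate lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def mixed_poisson_innovation(beta, rng):
    """ASSUMPTION: rate ~ Gamma(shape=2, scale=beta), then count ~ Poisson(rate)."""
    lam = rng.gammavariate(2.0, beta)
    return poisson_draw(lam, rng)


def simulate_inar1(alpha, innovation, n, x0=0, seed=0):
    """Simulate X_t = alpha o X_{t-1} + eps_t for n steps; innovation(rng) draws eps_t."""
    rng = random.Random(seed)
    series, x = [], x0
    for _ in range(n):
        x = binomial_thinning(alpha, x, rng) + innovation(rng)
        series.append(x)
    return series


if __name__ == "__main__":
    path = simulate_inar1(0.5, lambda rng: mixed_poisson_innovation(1.0, rng), 20, seed=1)
    print(path)
```

Passing the innovation sampler as a function keeps the thinning recursion separate from the choice of innovation distribution, so a different count distribution can be swapped in without touching `simulate_inar1`.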

Annals of Data Science, vol. 11, no. 5, pp. 1741-1759
Citations: 0
A New Compound Distribution and Its Applications in Over-dispersed Count Data
Q1 Decision Sciences Pub Date : 2023-06-07 DOI: 10.1007/s40745-023-00478-0
Peer Bilal Ahmad, Mohammad Kafeel Wani

Over-dispersed models are typically employed whenever the variance exceeds the mean, which is why they are such an important aspect of statistical modeling. In this work, the parameter of the Poisson distribution is assumed to follow a new lifetime distribution called the Chris-Jerry distribution. The resulting compound distribution is an over-dispersed model known as the Poisson-Chris-Jerry distribution. By deriving a general expression for the rth factorial moment, we obtained the moments about the origin and the central moments. In addition, moment-related measures, generating functions, the over-dispersion property, reliability characteristics, a recurrence relation for the probabilities, and other statistical properties are described. To estimate the parameter of the suggested model, maximum likelihood estimation and the method of moments are discussed. The usefulness of the maximum likelihood estimates is also examined through a simulation study. We employed four real-life data sets, examined goodness-of-fit tests, and considered additional criteria such as Akaike's information criterion and the Bayesian information criterion. The outcomes are compared with several candidate models.
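Why compounding a Poisson distribution yields over-dispersion can be seen from the laws of total expectation and total variance. The derivation below is a general sketch for any mixing distribution of the rate; the symbol \Lambda for the random Poisson rate is notation introduced here, not taken from the paper.

```latex
% X | \Lambda \sim \mathrm{Poisson}(\Lambda), with \Lambda > 0 random
\begin{align*}
  \mathbb{E}[X] &= \mathbb{E}\bigl[\mathbb{E}[X \mid \Lambda]\bigr] = \mathbb{E}[\Lambda], \\
  \operatorname{Var}(X)
    &= \mathbb{E}\bigl[\operatorname{Var}(X \mid \Lambda)\bigr]
     + \operatorname{Var}\bigl(\mathbb{E}[X \mid \Lambda]\bigr) \\
    &= \mathbb{E}[\Lambda] + \operatorname{Var}(\Lambda)
     \;\ge\; \mathbb{E}[X].
\end{align*}
```

The inequality is strict whenever \Lambda is not degenerate, so any non-trivial mixing distribution gives a variance larger than the mean.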

Annals of Data Science, vol. 11, no. 5, pp. 1799-1820
Citations: 0
Utilization of Priori Information in the Estimation of Population Mean for Time-Based Surveys
Q1 Decision Sciences Pub Date : 2023-06-05 DOI: 10.1007/s40745-023-00472-6
Sanjay Kumar, Priyanka Chhaparwal

Use of a priori information is very common at the estimation stage to form an estimator of a population parameter. Using prior information can lead to more accurate and efficient estimates. In this study, we utilized the information from past surveys along with the information available from the current survey, in the form of a hybrid exponentially weighted moving average, to suggest an estimator of the population mean that uses a known coefficient of variation of the study variable for time-based surveys. We derived the expression for the mean square error of the suggested estimator and established the mathematical conditions under which the suggested estimator is more efficient. The results showed that utilizing information from past and current surveys improves the estimator's efficiency. A simulation study and a real-life example are provided to support the use of the suggested estimator.
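A hybrid exponentially weighted moving average can be read as two EWMA smoothers applied in sequence to the survey means. The sketch below follows that reading as an assumption; the function name, the smoothing constants `lam1` and `lam2`, and the starting value are hypothetical, and the paper's estimator additionally uses the known coefficient of variation, which is not reproduced here.

```python
def hewma(sample_means, lam1, lam2, start):
    """Two-stage EWMA of successive survey means.

    Stage 1: e_t = lam1 * ybar_t + (1 - lam1) * e_{t-1}
    Stage 2: h_t = lam2 * e_t   + (1 - lam2) * h_{t-1}
    Both stages are initialised at `start` (an illustrative choice).
    """
    if not (0 < lam1 <= 1 and 0 < lam2 <= 1):
        raise ValueError("smoothing constants must lie in (0, 1]")
    e = h = start
    smoothed = []
    for ybar in sample_means:
        e = lam1 * ybar + (1 - lam1) * e
        h = lam2 * e + (1 - lam2) * h
        smoothed.append(h)
    return smoothed


if __name__ == "__main__":
    # Means from four hypothetical survey occasions.
    print(hewma([52.0, 49.5, 51.2, 50.4], lam1=0.3, lam2=0.2, start=50.0))
```

Smaller smoothing constants give more weight to past surveys; setting both constants to 1 discards the history and returns the current survey mean.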

Annals of Data Science, vol. 11, no. 5, pp. 1675-1685
Citations: 0
LADDERS: Log Based Anomaly Detection and Diagnosis for Enterprise Systems
Q1 Decision Sciences Pub Date : 2023-06-04 DOI: 10.1007/s40745-023-00471-7
Sakib A. Mondal, Prashanth Rv, Sagar Rao, Arun Menon

Enterprise software can fail not only because of malfunctioning application servers, but also because of performance degradation or non-availability of other servers or middle layers. Consequently, valuable time and resources are wasted in trying to identify the root cause of software failures. To address this, we have developed a framework called LADDERS. In LADDERS, anomalous incidents are detected from log events generated by various systems and from KPIs (Key Performance Indicators) through an ensemble of supervised and unsupervised models. Without transaction identifiers, it is not possible to relate events from different systems directly. LADDERS implements Recursive Parallel Causal Discovery (RPCD) to establish causal relationships among log events. The framework builds coresets using BICO to manage high volumes of log data during training and inference. A single anomaly can cause a number of further anomalies throughout the systems. LADDERS makes use of RPCD again to discover causal relationships among these anomalous events. Probable root causes are revealed from the causal graph and the anomaly ratings of events using a k-shortest path algorithm. We evaluated LADDERS using live logs from an enterprise system. The results demonstrate its effectiveness and efficiency for anomaly detection.
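The idea of ranking candidate root causes by short paths in a causal graph can be illustrated with a small k-shortest-paths search. This is a generic best-first enumeration of simple paths for non-negative edge weights, not the algorithm or data used in LADDERS; the graph, node names, and weights are hypothetical.

```python
import heapq


def k_shortest_paths(graph, source, target, k):
    """Return up to k cheapest simple paths from source to target.

    graph maps a node to a dict {neighbour: non-negative edge weight}.
    Partial paths are expanded in order of accumulated cost, so complete
    paths are found cheapest first.
    """
    heap = [(0.0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:  # keep paths simple (no repeated nodes)
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return found


if __name__ == "__main__":
    # Hypothetical causal graph: an edge u -> v means "an anomaly in u may cause one in v".
    causal = {
        "db": {"api": 1.0, "cache": 2.0},
        "cache": {"api": 1.0},
        "api": {"ui": 1.0},
    }
    for cost, path in k_shortest_paths(causal, "db", "ui", k=2):
        print(cost, " -> ".join(path))
```

The enumeration can grow quickly on dense graphs; Yen's algorithm is the usual choice when that becomes a problem.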

Annals of Data Science, vol. 11, no. 4, pp. 1165-1183
Citations: 0
Jump-Drop Adjusted Prediction of Cumulative Infected Cases Using the Modified SIS Model
Q1 Decision Sciences Pub Date : 2023-05-15 DOI: 10.1007/s40745-023-00467-3
Rashi Mohta, Sravya Prathapani, Palash Ghosh

Accurate prediction of cumulative COVID-19 infected cases is essential for effectively managing the limited healthcare resources in India. Historically, epidemiological models have helped in controlling such epidemics. Models require accurate historical data to predict future outcomes. In our data, there were days exhibiting erratic, apparently anomalous jumps and drops in the number of daily reported COVID-19 infected cases that did not conform with the overall trend. Including those observations in the training data would most likely worsen model predictive accuracy. However, with existing epidemiological models it is not straightforward to determine, for a specific day, whether or not an outcome should be considered anomalous. In this work, we propose an algorithm to automatically identify anomalous ‘jump’ and ‘drop’ days, and then based upon the overall trend, the number of daily infected cases for those days is adjusted and the training data is amended using the adjusted observations. We applied the algorithm in conjunction with a recently proposed, modified Susceptible-Infected-Susceptible (SIS) model to demonstrate that prediction accuracy is improved after adjusting training data counts for apparent erratic anomalous jumps and drops.

Annals of Data Science, vol. 11, no. 3, pp. 959-978
Citations: 0
A Novel Stock Trading Model based on Reinforcement Learning and Technical Analysis
Q1 Decision Sciences Pub Date : 2023-05-11 DOI: 10.1007/s40745-023-00469-1
Zahra Pourahmadi, Dariush Fareed, Hamid Reza Mirzaei

This study investigates the potential of using reinforcement learning (RL) to establish a financial trading system (FTS), taking into account the main constraints imposed by the stock market, e.g., transaction costs. More specifically, this paper shows the inferior performance of the pure reinforcement learning model when it is applied in a multi-dimensional and noisy stock market environment. To solve this problem and obtain a practical and reasonable trading strategy, a modified RL model is proposed based on the actor-critic method, in which we have amended the actor by incorporating three metrics from technical analysis. The results show significant improvement compared with traditional trading strategies. The reliability of the model is verified by experimental results on financial data (S&P500 index), and a fair evaluation of the proposed method against pure RL and three benchmarks is demonstrated. Statistical analysis proves that the combination of (a) technical analysis (role-based strategies), (b) RL (machine learning strategies), and (c) restricting the action of the RL policy network with a few realistic conditions results in trading decisions with higher investment return rates.
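Restricting a policy's proposed action with technical-analysis signals and realistic conditions can be sketched as a small filter applied after the policy network. Everything below is illustrative: the moving-average crossover, the fee rate, and the function names are assumptions, and the paper's three technical metrics are not named in the abstract.

```python
def sma(prices, window):
    """Simple moving average of the last `window` prices."""
    if len(prices) < window:
        raise ValueError("not enough prices for this window")
    return sum(prices[-window:]) / window


def constrain_action(proposed, prices, shares_held, cash,
                     fee_rate=0.001, short=3, long=5):
    """Return the action actually executed: 'buy', 'sell' or 'hold'.

    Hypothetical conditions: no selling without a position, no buying
    without enough cash to cover price plus transaction cost, and no
    trading against a short/long moving-average crossover signal.
    """
    price = prices[-1]
    if proposed == "sell" and shares_held == 0:
        return "hold"
    if proposed == "buy" and cash < price * (1 + fee_rate):
        return "hold"
    if len(prices) >= long:
        uptrend = sma(prices, short) > sma(prices, long)
        if proposed == "buy" and not uptrend:
            return "hold"
        if proposed == "sell" and uptrend:
            return "hold"
    return proposed


if __name__ == "__main__":
    rising = [10, 10, 10, 11, 12]
    print(constrain_action("buy", rising, shares_held=0, cash=100.0))
```

Keeping the filter outside the policy makes the constraints easy to audit, at the cost of the policy not learning them directly.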

Annals of Data Science, vol. 11, no. 5, pp. 1653-1674
Citations: 0
Platform Resource Scheduling Method Based on Branch-and-Bound and Genetic Algorithm
Q1 Decision Sciences Pub Date : 2023-05-11 DOI: 10.1007/s40745-023-00470-8
Yanfen Zhang, Jinyao Ma, Haibin Zhang, Bin Yue

Platform resource scheduling is an operational research optimization problem of matching tasks to platform resources, with important applications in production or marketing arrangement layout, combat task planning, etc. Existing algorithms are inflexible in the task planning sequence and have poor stability. To address this defect, this paper combines the branch-and-bound algorithm with the genetic algorithm. The branch-and-bound algorithm can adaptively adjust the next task to be planned and calculate a variety of feasible task planning sequences. The genetic algorithm is used to assign a platform combination to the selected task. In addition, we put forward a new lower bound calculation method and pruning rule. Beyond the processing time of the direct successor tasks, the influence of the tasks' resource requirements on task priority is considered. Numerical experiments show that the proposed algorithm performs well on the platform resource scheduling problem.
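The role of a lower bound and a pruning rule in branch-and-bound can be shown on a much simpler problem than the one in the paper: assigning each task to a distinct platform at minimum total cost. The bound used here (cost so far plus the cheapest platform for every remaining task) and the cost matrix are illustrative assumptions, not the paper's bound, and the genetic-algorithm step is not included.

```python
def assign_tasks(cost):
    """Branch-and-bound for a square task-to-platform assignment problem.

    cost[t][p] is the cost of running task t on platform p; each platform
    is used exactly once. Returns (best_total, platform_for_each_task).
    """
    n = len(cost)
    # Optimistic cost of tasks t..n-1: each takes its cheapest platform.
    remaining_bound = [0.0] * (n + 1)
    for t in range(n - 1, -1, -1):
        remaining_bound[t] = remaining_bound[t + 1] + min(cost[t])

    best = {"total": float("inf"), "plan": None}

    def branch(task, used, plan, so_far):
        # Prune: even the optimistic completion cannot beat the incumbent.
        if so_far + remaining_bound[task] >= best["total"]:
            return
        if task == n:
            best["total"], best["plan"] = so_far, list(plan)
            return
        for platform in range(n):
            if platform not in used:
                used.add(platform)
                plan.append(platform)
                branch(task + 1, used, plan, so_far + cost[task][platform])
                plan.pop()
                used.remove(platform)

    branch(0, set(), [], 0.0)
    return best["total"], best["plan"]


if __name__ == "__main__":
    print(assign_tasks([[9, 2, 7], [6, 4, 3], [5, 8, 1]]))
```

A tighter bound prunes more branches but costs more to compute, which is the trade-off any new lower-bound rule has to balance.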

Annals of Data Science, vol. 10, no. 5, pp. 1421-1445
Citations: 0
Bayesian Estimation of Multiple Covariate of Autoregressive (MC-AR) Model
Q1 Decision Sciences Pub Date : 2023-05-04 DOI: 10.1007/s40745-023-00468-2
Jitendra Kumar, Ashok Kumar, Varun Agiwal

Handling covariates (explanatory variables) within a model is currently one of the most important factors in modelling. The main advantage of a covariate is its dependence on past observations, so the study variable is modelled in terms of both its own past and the past and future observations of the covariates. This paper deals with the estimation of the parameters of an autoregressive model with multiple covariates under a Bayesian approach. A simulation and an empirical study are performed to check the applicability of the model, and better results are recorded.

Annals of Data Science, vol. 11, no. 4, pp. 1291-1301
Citations: 0
A Bayes Analysis of Random Walk Model Under Different Error Assumptions
Q1 Decision Sciences Pub Date : 2023-04-22 DOI: 10.1007/s40745-023-00465-5
Praveen Kumar Tripathi, Manika Agarwal

In this paper, Bayesian analyses of the random walk model are performed under three assumptions for the error component, taken one at a time: the normal distribution, the log-normal distribution, and the stochastic volatility model. For the various parameters in each model, suitable choices of informative and non-informative priors are made and the posterior distributions are calculated. For the first two choices of error distribution, the posterior samples are easily obtained by using the gamma generating routine in R software. For a random walk model with stochastic volatility errors, Gibbs sampling with intermediate independent Metropolis-Hastings steps is employed to obtain the desired posterior samples. The whole procedure is numerically illustrated through a real data set of crude oil prices from April 2014 to March 2022. The models are then compared on the basis of their accuracy in forecasting the true values. Among the choices considered, the random walk model with stochastic volatility errors performed best for the data at hand.
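For the normal-error case, the remark that posterior samples come directly from a gamma generator can be illustrated with the standard conjugate update. The sketch assumes a Gamma(a, b) prior (shape a, rate b) on the error precision, which is a textbook choice and not necessarily the prior used in the paper; the function names are hypothetical, and Python's `random.gammavariate` stands in for the R routine.

```python
import random


def precision_posterior(y, a, b):
    """Posterior (shape, rate) of the precision 1/sigma^2 for y_t = y_{t-1} + e_t.

    With e_t ~ N(0, sigma^2) and a Gamma(a, b) prior on the precision,
    the posterior is Gamma(a + n/2, b + sum(diff^2)/2), n = number of steps.
    """
    diffs = [y[t] - y[t - 1] for t in range(1, len(y))]
    sum_sq = sum(d * d for d in diffs)
    return a + len(diffs) / 2, b + sum_sq / 2


def sample_sigma2(y, a, b, draws, seed=0):
    """Draw error variances sigma^2 from the posterior by inverting gamma draws."""
    rng = random.Random(seed)
    shape, rate = precision_posterior(y, a, b)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    return [1.0 / rng.gammavariate(shape, 1.0 / rate) for _ in range(draws)]


if __name__ == "__main__":
    series = [1.0, 2.0, 4.0, 3.0]  # toy data, not the crude oil series
    print(precision_posterior(series, a=2.0, b=1.0))
    print(sample_sigma2(series, a=2.0, b=1.0, draws=3))
```

The stochastic volatility case has no such closed form, which is why the abstract turns to Gibbs sampling with Metropolis-Hastings steps there.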

本文在正态分布、对数正态分布和随机波动模型的假设条件下,对随机漫步模型的误差分量逐一进行了贝叶斯分析。对于每个模型中的各种参数,我们都选择了合适的信息先验和非信息先验,并计算了后验分布。对于误差分布的前两种选择,使用 R 软件中的伽玛生成例程可以轻松获得后验样本。对于具有随机波动误差的随机漫步模型,则采用具有中间独立 Metropolis-Hastings 步骤的 Gibbs 采样来获得所需的后验样本。整个过程通过 2014 年 4 月至 2022 年 3 月原油价格的真实数据集进行了数值说明。然后,根据模型预测真实值的准确性对其进行比较。在其他选择中,具有随机波动误差的随机漫步模型对当前数据的预测效果更好。
{"title":"A Bayes Analysis of Random Walk Model Under Different Error Assumptions","authors":"Praveen Kumar Tripathi,&nbsp;Manika Agarwal","doi":"10.1007/s40745-023-00465-5","DOIUrl":"10.1007/s40745-023-00465-5","url":null,"abstract":"<div><p>In this paper, the Bayesian analyses for the random walk models have been performed under the assumptions of normal distribution, log-normal distribution and the stochastic volatility model, for the error component, one by one. For the various parameters, in each model, some suitable choices of informative and non-informative priors have been made and the posterior distributions are calculated. For the first two choices of error distribution, the posterior samples are easily obtained by using the gamma generating routine in R software. For a random walk model, having stochastic volatility error, the Gibbs sampling with intermediate independent Metropolis–Hastings steps is employed to obtain the desired posterior samples. The whole procedure is numerically illustrated through a real data set of crude oil prices from April 2014 to March 2022. The models are, then, compared on the basis of their accuracies in forecasting the true values. Among the other choices, the random walk model with stochastic volatile errors outperformed for the data in hand.\u0000</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 5","pages":"1635 - 1652"},"PeriodicalIF":0.0,"publicationDate":"2023-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47611888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Count Regression and Machine Learning Techniques for Zero-Inflated Overdispersed Count Data: Application to Ecological Data
Q1 Decision Sciences Pub Date : 2023-04-13 DOI: 10.1007/s40745-023-00464-6
Bonelwa Sidumo, Energy Sonono, Isaac Takaidza

The aim of this study is to investigate the overdispersion problem that is rampant in ecological count data. To explore this problem, we consider the most commonly used count regression models: the Poisson, the negative binomial, the zero-inflated Poisson and the zero-inflated negative binomial models. The performance of these count regression models is compared with that of four proposed machine learning (ML) regression techniques: random forests, support vector machines, k-nearest neighbors and artificial neural networks. The mean absolute error was used to compare the performance of the count regression models and the ML regression models. The results suggest that the ML regression models perform better than the count regression models. The performance shown by the ML regression techniques motivates further research into improving methods and applications in ecological studies.
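As a rough illustration of the quantities the abstract refers to, the sketch below computes a variance-to-mean ratio (a common informal check for overdispersion), the excess share of zeros relative to a Poisson with the same mean, and the mean absolute error. The function names and the `counts` data are hypothetical and are not taken from the paper; the study's own models and data are not reproduced here.

```python
import math
from statistics import mean, pvariance


def dispersion_index(counts):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion."""
    return pvariance(counts) / mean(counts)


def excess_zero_fraction(counts):
    """Observed share of zeros minus the share implied by a Poisson
    distribution with the same mean, which is exp(-mean)."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    expected = math.exp(-mean(counts))
    return observed - expected


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted counts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical counts with many zeros and a few large values.
counts = [0, 0, 0, 0, 0, 1, 0, 2, 0, 9, 0, 12]

print("dispersion index:", dispersion_index(counts))
print("excess zero fraction:", excess_zero_fraction(counts))

# Baseline that always predicts the sample mean, scored by MAE.
baseline = [mean(counts)] * len(counts)
print("MAE of mean-only baseline:", mean_absolute_error(counts, baseline))
```

A dispersion index far above 1 together with a positive excess zero fraction is the kind of pattern that zero-inflated and negative binomial models are meant to handle; the MAE function is the comparison metric the abstract names.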

Bonelwa Sidumo, Energy Sonono, Isaac Takaidza. "Count Regression and Machine Learning Techniques for Zero-Inflated Overdispersed Count Data: Application to Ecological Data." Annals of Data Science, vol. 11, no. 3, pp. 803-817. Published 2023-04-13. DOI: 10.1007/s40745-023-00464-6. Open access PDF: https://link.springer.com/content/pdf/10.1007/s40745-023-00464-6.pdf
Citations: 0
Journal
Annals of Data Science