Testing the equality of high dimensional distributions
Reza Modarres
Computational Statistics & Data Analysis, vol. 212, Article 108245. DOI: 10.1016/j.csda.2025.108245
Pub Date: 2025-12-01 (Epub: 2025-07-09)

The Euclidean distance is not suitable for high dimensional settings because of the distance concentration phenomenon. A novel statistic, inspired by interpoint distances but avoiding their computation, is proposed for comparing and visualizing high dimensional datasets. The new statistic is based on a high dimensional dissimilarity index that takes advantage of the concentration phenomenon. A simultaneous display of observation means and standard deviations is discussed that aids visualization and detection of suspected outliers and enhances separability among the competing classes in the transformed space. The finite sample convergence of the dissimilarity indices is studied, nine statistics are compared under several distributions, and three applications are presented.
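The distance concentration phenomenon the abstract invokes can be seen directly in a few lines of code. The sketch below is illustrative only (the function name and toy data are ours, not from the paper): as the dimension grows, the relative contrast between the largest and smallest pairwise Euclidean distances of i.i.d. points collapses toward zero.

```python
import math
import random

def pairwise_relative_contrast(n_points, dim, seed=0):
    """(max - min) / min over all pairwise Euclidean distances of
    n_points i.i.d. uniform points in [0, 1]^dim."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.dist(pts[i], pts[j])
             for i in range(n_points) for j in range(i + 1, n_points)]
    return (max(dists) - min(dists)) / min(dists)

# The relative contrast collapses as the dimension grows, which is why
# raw Euclidean distances become uninformative in high dimensions.
low_dim = pairwise_relative_contrast(50, 2)
high_dim = pairwise_relative_contrast(50, 2000)
print(f"dim=2:    contrast {low_dim:.2f}")
print(f"dim=2000: contrast {high_dim:.2f}")
```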
Simultaneously detecting spatiotemporal changes with penalized Poisson regression models
Zerui Zhang, Xin Wang, Xin Zhang, Jing Zhang
Computational Statistics & Data Analysis, vol. 212, Article 108240. DOI: 10.1016/j.csda.2025.108240
Pub Date: 2025-12-01 (Epub: 2025-07-07)

In large-scale spatiotemporal data, abrupt changes commonly occur across both the spatial and temporal domains. To address the concurrent challenges of detecting change points and identifying spatial clusters within spatiotemporal count data, an innovative method based on the Poisson regression model is introduced. The proposed method employs doubly fused penalization to unveil the underlying spatiotemporal change patterns. To estimate the model efficiently, an iterative shrinkage-thresholding algorithm is developed to minimize the doubly penalized likelihood function. The reliability and accuracy of the estimates are confirmed by statistical consistency properties. Furthermore, extensive numerical experiments validate the theoretical findings and highlight the superior performance of the proposed method compared to existing competitive approaches.
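The iterative shrinkage-thresholding idea behind the estimation algorithm can be sketched for the simpler case of an l1-penalized Poisson regression. This is a stand-in, not the paper's method: the plain lasso penalty replaces the doubly fused penalty, and the step size, iteration count, and toy data are our assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_poisson_lasso(X, y, lam, step=1e-3, n_iter=5000):
    """Minimize sum_i [exp(x_i'b) - y_i * x_i'b] + lam * ||b||_1 by
    iterative shrinkage-thresholding: a gradient step on the smooth
    Poisson negative log-likelihood, then soft-thresholding."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (np.exp(X @ beta) - y)
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Toy data: counts driven by the first coordinate only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.poisson(np.exp(0.8 * X[:, 0]))
beta = ista_poisson_lasso(X, y, lam=10.0)
print(np.round(beta, 2))
```

The fused variants used in the paper replace the l1 proximal step with the proximal operator of the fusion penalty, but the gradient-then-shrink structure of the iteration is the same.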
Pure interaction effects unseen by Random Forests
Ricardo Blum, Munir Hiabu, Enno Mammen, Joseph T. Meyer
Computational Statistics & Data Analysis, vol. 212, Article 108237. DOI: 10.1016/j.csda.2025.108237
Pub Date: 2025-12-01 (Epub: 2025-07-01)

Random Forests are widely claimed to capture interactions well. However, some simple examples suggest that they perform poorly in the presence of certain pure interactions that the conventional CART criterion struggles to capture during tree construction. Motivated by this, it is argued that simple alternative partitioning schemes in the tree growing procedure can enhance the identification of these interactions. In a simulation study, these variants are compared to conventional Random Forests and Extremely Randomized Trees. The results confirm that the modifications considered enhance the model's fitting ability in scenarios where pure interactions play a crucial role. Finally, the methods are applied to real datasets.
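A pure interaction of the kind the abstract describes can be made concrete with a toy example (our construction, not the paper's): for y = sign(x1 * x2), no single CART-style split on x1 or x2 alone reduces the variance by much, even though the two variables jointly determine y exactly.

```python
import random

def variance(ys):
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def best_split_gain(xs, ys):
    """Largest CART-style variance reduction over all splits on one feature."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ys_sorted = [ys[i] for i in order]
    n, base = len(ys), variance(ys)
    best = 0.0
    for k in range(1, n):
        left, right = ys_sorted[:k], ys_sorted[k:]
        gain = base - (len(left) * variance(left) + len(right) * variance(right)) / n
        best = max(best, gain)
    return best

rng = random.Random(1)
x1 = [rng.uniform(-1, 1) for _ in range(400)]
x2 = [rng.uniform(-1, 1) for _ in range(400)]
y = [(1.0 if a * b > 0 else -1.0) for a, b in zip(x1, x2)]  # pure XOR-type interaction

# Both gains are near zero: the CART criterion sees no reason to split,
# so the interaction stays invisible during greedy tree construction.
print(best_split_gain(x1, y))
print(best_split_gain(x2, y))
```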
Bayesian forecasting of Italian seismicity using the spatiotemporal RETAS model
Tom Stindl, Zelong Bi, Clara Grazian
Computational Statistics & Data Analysis, vol. 212, Article 108219. DOI: 10.1016/j.csda.2025.108219
Pub Date: 2025-12-01 (Epub: 2025-06-02)

Spatiotemporal Renewal Epidemic Type Aftershock Sequence models are self-exciting point processes that model the occurrence time, epicenter, and magnitude of earthquakes in a geographical region. The arrival rate of earthquakes is formulated as the superposition of a main shock renewal process and homogeneous Poisson processes for the aftershocks, motivated by empirical laws in seismology. Existing methods for model fitting rely on maximizing the log-likelihood by either direct numerical optimization or Expectation Maximization algorithms, both of which can suffer from convergence issues and lack adequate quantification of parameter estimation uncertainty. To address these limitations, a Bayesian approach is employed, with posterior inference carried out using a data augmentation strategy within a Markov chain Monte Carlo framework. The branching structure is treated as a latent variable to improve sampling efficiency, and a purpose-built Hamiltonian Monte Carlo sampler is implemented to update the parameters within the Gibbs sampler. This methodology enables parameter uncertainty to be incorporated into forecasts of seismicity. Estimation and forecasting are demonstrated on simulated catalogs and an earthquake catalog from Italy. R code implementing the methods is provided in the Supplementary Materials.
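The self-exciting structure of such models can be illustrated with a purely temporal Hawkes process simulated by Ogata's thinning algorithm. This is a stripped-down sketch, not the RETAS model itself: an exponential kernel replaces the seismological laws, there are no spatial locations or magnitudes, and the parameter values are ours.

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Ogata thinning for a temporal Hawkes process with intensity
    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    Between events the intensity only decays, so its value at the
    current time is a valid upper bound for the rejection step."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while t < horizon:
        lam_bar = mu + alpha * sum(math.exp(-beta * (t - ti)) for ti in events)
        t += rng.expovariate(lam_bar)
        if t >= horizon:
            break
        lam_t = mu + alpha * sum(math.exp(-beta * (t - ti)) for ti in events)
        if rng.random() <= lam_t / lam_bar:
            events.append(t)  # accepted: this event excites future arrivals
    return events

# Branching ratio alpha/beta = 2/3 < 1, so the process is stationary
# with long-run rate mu / (1 - alpha/beta) = 1.5 events per unit time.
events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=200.0)
print(len(events))
```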
Joint estimation of precision matrices for long-memory time series
Qihu Zhang, Jongik Chung, Cheolwoo Park
Computational Statistics & Data Analysis, vol. 212, Article 108234. DOI: 10.1016/j.csda.2025.108234
Pub Date: 2025-12-01 (Epub: 2025-06-19)

Methods are proposed for estimating multiple precision matrices for long-memory time series, with particular emphasis on the analysis of resting-state functional magnetic resonance imaging (fMRI) data obtained from multiple subjects. The objective is to estimate both individual brain networks and a common structure representative of a group. Several approaches employing weighted aggregation are introduced to simultaneously estimate individual and group-level precision matrices. Convergence rates of the estimators are examined under various norms and expectations, and their performance is evaluated under both sub-Gaussian and heavy-tailed distributions. The proposed methods are demonstrated through simulated data and real resting-state fMRI datasets.
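The weighted-aggregation idea can be sketched in a deliberately simplified form. The ridge-regularized inverses and uniform weights below are our stand-ins; the paper's estimators and weighting schemes, which account for long-memory dependence, are considerably more involved.

```python
import numpy as np

def joint_precision_estimates(samples, ridge=0.1, weights=None):
    """Per-subject precision matrices from ridge-regularized sample
    covariances, plus a weighted average as a crude group-level
    estimate of the common structure."""
    n_sub = len(samples)
    if weights is None:
        weights = [1.0 / n_sub] * n_sub  # uniform aggregation weights
    precisions = []
    for X in samples:
        cov = np.cov(X, rowvar=False)
        precisions.append(np.linalg.inv(cov + ridge * np.eye(cov.shape[0])))
    group = sum(w * P for w, P in zip(weights, precisions))
    return precisions, group

# Five "subjects", each with 300 observations of a 4-node network.
rng = np.random.default_rng(0)
samples = [rng.normal(size=(300, 4)) for _ in range(5)]
precisions, group = joint_precision_estimates(samples)
print(np.round(group, 2))
```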
Conditional inference for ultrahigh-dimensional additive hazards model
Meiling Hao, Ruiyu Yang, Fangfang Bai, Liuquan Sun
Computational Statistics & Data Analysis, vol. 212, Article 108244. DOI: 10.1016/j.csda.2025.108244
Pub Date: 2025-12-01 (Epub: 2025-07-04)

In the realm of high-throughput genomic data, modeling with ultrahigh-dimensional covariates and censored survival outcomes is of great importance. We conduct conditional inference for the ultrahigh-dimensional additive hazards model, allowing both the covariates of interest and nuisance covariates to be ultrahigh-dimensional. The presence of right censorship with survival outcomes adds an extra layer of complexity to the original data structure, posing significant challenges for the ultrahigh-dimensional additive hazards model. To address this, we introduce an innovative test statistic based on the quadratic norm of the score function. Moreover, when there is a high correlation between the covariates of interest and nuisance covariates, we propose a decorrelated score function-based test statistic to enhance statistical power. Additionally, we establish the limiting distributions of the test statistics under both the null and local alternative hypotheses, further enhancing the computational appeal of our approach. The proposed statistics are thoroughly evaluated through extensive simulation studies and applied to two real data examples.
Modeling continuous distributions in hybrid Bayesian networks using mixtures of polynomials with tails
J.C. Luengo, D. Ramos-López, R. Rumí
Computational Statistics & Data Analysis, vol. 212, Article 108246. DOI: 10.1016/j.csda.2025.108246
Pub Date: 2025-12-01 (Epub: 2025-07-11)

A new approach to modeling continuous distributions in hybrid Bayesian networks (BNs) is presented. It is based on Mixtures of Polynomials (MoPs) with tails, termed tMoPs. This proposal is a variation of the usual MoP model that adds tails and several other improvements in the learning process. Adequate modeling of the tails of variable distributions is relevant both theoretically and in many real applications, in which rare phenomena may have a great impact. The proposed approach is designed to exploit the flexibility of the tMoP model to fit different continuous data distributions. This is especially relevant for distributions with zones of density close to zero, in which polynomial fitting may be difficult. In these situations, tMoPs allow a polynomial fit in parts with higher density and the use of tails in areas with lower density. This permits a better global fit, without loss of overall accuracy, while yielding a relatively simple density function. Learning algorithms for tMoP conditional probability distributions with up to two parents of any type are developed. These tMoPs may be integrated into hybrid Bayesian networks to represent conditional probability distributions, thus enabling probabilistic reasoning such as causal inference, sensitivity analysis, and other decision-making operations. The suitability of tMoPs is evaluated in several ways, using a large set of real datasets of different natures. The experiments include the analysis of goodness of fit with several continuous and pseudo-continuous variables, the optimization of certain parameters, the effect of variable selection and graph structure when using tMoPs in BNs, and the evaluation of the predictive ability of hybrid BNs based on tMoPs in classification and regression. The results show the good behavior of the proposal, with tMoP hybrid Bayesian networks matching or outperforming other techniques in most scenarios, in addition to providing a more informative and convenient probabilistic model.
Kernel density estimation for compositional data with zeros via hypersphere mapping
Changwon Yoon, Hyunbin Choi, Jeongyoun Ahn
Computational Statistics & Data Analysis, vol. 212, Article 108249. DOI: 10.1016/j.csda.2025.108249
Pub Date: 2025-12-01 (Epub: 2025-07-11)

Compositional data—measurements of relative proportions among components—arise frequently in fields ranging from chemometrics to bioinformatics. While density estimation of such data provides crucial insights into their underlying patterns and enables comparative analyses across groups, existing nonparametric approaches are limited, particularly in handling zero components that commonly occur in real-world datasets. We propose a novel kernel density estimation (KDE) method for compositional data that naturally accommodates zero components by exploiting the geometric correspondence between simplices and hyperspheres. This connection to spherical KDE allows us to establish theoretical guarantees, including consistency of the estimator. Through extensive simulations and real data analyses, we demonstrate our method's advantages over existing approaches, particularly in scenarios involving zero components.
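The simplex-to-hypersphere correspondence the method exploits is, in its simplest form, the componentwise square-root map (the sketch below is ours; the paper's construction may differ in detail). A composition with zero components maps to a point on the unit sphere with no special handling, whereas log-ratio transforms are undefined at zero.

```python
import math

def simplex_to_sphere(p):
    """Square-root map sending a composition (non-negative entries
    summing to one, after normalization) onto the unit hypersphere.
    Zero components map cleanly to zero coordinates."""
    total = sum(p)
    return [math.sqrt(x / total) for x in p]

comp = [0.5, 0.3, 0.2, 0.0]           # composition with a zero component
z = simplex_to_sphere(comp)
print(z)
print(sum(v * v for v in z))           # squared norm is 1: z lies on the sphere
```

Once the data sit on the sphere, standard spherical KDE machinery (e.g. von Mises-Fisher kernels) applies, which is what makes the consistency theory tractable.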
Bayesian selection approach for categorical responses via multinomial probit models
Chi-Hsiang Chu, Kuo-Jung Lee, Chien-Chin Hsu, Ray-Bing Chen
Computational Statistics & Data Analysis, vol. 212, Article 108233. DOI: 10.1016/j.csda.2025.108233
Pub Date: 2025-12-01

A multinomial probit model is proposed to examine a categorical response variable, with the main objective being the identification of the influential variables in the model. To this end, a Bayesian selection technique using two hierarchical indicators is employed. The first indicator denotes a variable's relevance to the categorical response, and the second relates to the variable's importance at a specific categorical level, which aids in assessing its impact at that level. The selection process relies on the posterior indicator samples generated through an MCMC algorithm. The efficacy of the Bayesian selection strategy is demonstrated through both simulation and an application to a real-world example.
Inference on a stochastic SIR model including growth curves
Giuseppina Albano, Virginia Giorno, Gema Pérez-Romero, Francisco de Asis Torres-Ruiz
Computational Statistics & Data Analysis, vol. 212, Article 108231. DOI: 10.1016/j.csda.2025.108231
Pub Date: 2025-12-01 (Epub: 2025-06-16)

A Susceptible-Infected-Removed stochastic model is presented, in which the stochasticity is introduced through two independent Brownian motions in the dynamics of the Susceptible and Infected populations. To account for the natural evolution of the Susceptible population, a growth function is considered in which size is influenced by the birth and death of individuals. Inference for such a model is addressed by means of a Quasi Maximum Likelihood Estimation (QMLE) method. The resulting nonlinear system can be numerically solved by iterative procedures. A technique to obtain the initial solutions usually required by such methods is also provided. Finally, simulation studies are performed for three well-known growth functions, namely the Gompertz, Logistic and Bertalanffy curves. The performance of the initial estimates of the involved parameters is assessed, and the goodness of the proposed methodology is evaluated.
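The model class can be illustrated with an Euler-Maruyama discretization of an SIR system perturbed by two independent Brownian motions. The noise placement, parameter values, and the omission of the growth function for the Susceptible population are our simplifications for illustration, not the paper's formulation.

```python
import math
import random

def simulate_stochastic_sir(beta, gamma, sigma_s, sigma_i,
                            s0, i0, T, dt, seed=0):
    """Euler-Maruyama path of
        dS = -beta*S*I dt + sigma_s*S dW1,
        dI = (beta*S*I - gamma*I) dt + sigma_i*I dW2,
    with W1, W2 independent Brownian motions."""
    rng = random.Random(seed)
    s, i = s0, i0
    path = [(0.0, s, i)]
    steps = round(T / dt)
    for k in range(1, steps + 1):
        dw_s = rng.gauss(0.0, math.sqrt(dt))  # independent increments
        dw_i = rng.gauss(0.0, math.sqrt(dt))
        ds = -beta * s * i * dt + sigma_s * s * dw_s
        di = (beta * s * i - gamma * i) * dt + sigma_i * i * dw_i
        s = max(s + ds, 0.0)  # keep proportions non-negative
        i = max(i + di, 0.0)
        path.append((k * dt, s, i))
    return path

# Basic reproduction number beta*s0/gamma ~ 3: an epidemic runs its course.
path = simulate_stochastic_sir(beta=0.3, gamma=0.1, sigma_s=0.02, sigma_i=0.02,
                               s0=0.99, i0=0.01, T=100.0, dt=0.01)
t_final, s_final, i_final = path[-1]
print(f"final S={s_final:.3f}, I={i_final:.3f}")
```

A QMLE procedure such as the paper's would treat paths like this as data and match the discretized drift and diffusion terms to the observations.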