
Computational Statistics & Data Analysis: Latest Articles

Variable selection for spatio-temporal conditionally Poisson point processes
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-27 DOI: 10.1016/j.csda.2025.108238
Achmad Choiruddin, Jonatan A. González, Jorge Mateu, Alwan Fadlurohman, Rasmus Waagepetersen
Spatio-temporal point pattern data are becoming prevalent in many scientific disciplines. We consider a sequence of spatial point processes where each point process is Poisson given the past. We model the conditional first-order intensity function of each point process as a parametric log-linear function of spatial, temporal, and spatio-temporal covariates that may depend on previous point patterns. Dealing with spatio-temporal covariates brings computational and methodological challenges compared to the purely spatial case. We extend regularisation methods for spatial point process variable selection to obtain parsimonious and interpretable models in the considered spatio-temporal case. Using our proposed methodology, we conduct two simulation studies and examine an application to criminal activity in the Kennedy district of Bogota. In the application, we consider spatio-temporal point pattern data of crime locations and a number of spatial, temporal, and spatio-temporal covariates related to urban places, environmental factors, and further space-time factors. The intensity function of vehicle thefts is estimated, considering other crimes as covariate information. The proposed methodology offers a comprehensive approach for analysing spatio-temporal point pattern crime data, capturing complex relationships between covariates and crime occurrences over space and time.
Volume 212, Article 108238. Citations: 0
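The regularised log-linear intensity described in the Choiruddin et al. abstract above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: each time slice is discretised into grid cells, the count in each cell is Poisson with mean exp(offset + b0 + x'beta), and a plain lasso penalty is fitted by proximal gradient descent with a fixed step. The names `soft_threshold` and `lasso_poisson`, the step size, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage towards zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_poisson(X, y, offset, lam, step=0.01, n_iter=5000):
    """Lasso-penalised Poisson regression fitted by proximal gradient descent.

    Minimises  mean(mu - y * eta) + lam * ||beta||_1  with
    eta = offset + b0 + X @ beta and mu = exp(eta). The intercept b0 is
    left unpenalised. Each row of X is one (cell, time) pair; a column may
    hold a spatio-temporal covariate such as the number of earlier events
    in that cell. The offset is typically log(cell area * interval length).
    """
    n, p = X.shape
    b0, beta = 0.0, np.zeros(p)
    for _ in range(n_iter):
        mu = np.exp(offset + b0 + X @ beta)
        resid = mu - y
        # Gradient step on the smooth part, then shrinkage on beta only.
        b0 -= step * resid.mean()
        beta = soft_threshold(beta - step * (X.T @ resid) / n, step * lam)
    return b0, beta
```

Coefficients that come back exactly zero correspond to covariates dropped from the intensity model. A fixed step is only safe for standardised covariates and moderate counts; a line search would be the more careful choice.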
A score-based threshold effect test in time series models
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-25 DOI: 10.1016/j.csda.2025.108236
Shufang Wei, Yaping Deng, Yaxing Yang
A score-based test statistic is developed to compare a linear ARMA model with its threshold extension. In particular, the focus is on testing the threshold effect in continuous threshold models with no jump at the threshold. Notably, while developed for continuous threshold models, the proposed test remains effective for discontinuous cases. The proposed test does not require fitting the model under the alternative hypothesis, making it computationally more efficient than the quasi-likelihood ratio test. The asymptotic distributions of the score-based test statistic are derived under both the null hypothesis and local alternatives. Simulations indicate that the proposed test has better size than the quasi-likelihood ratio test and demonstrates stronger power compared to the Lagrange Multiplier test. The asymptotic theory of the least squares estimation for the continuous threshold ARMA model is further established. An application to quarterly U.S. civilian unemployment rate data is given.
Volume 212, Article 108236. Citations: 0
Bayesian selection approach for categorical responses via multinomial probit models
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-20 DOI: 10.1016/j.csda.2025.108233
Chi-Hsiang Chu, Kuo-Jung Lee, Chien-Chin Hsu, Ray-Bing Chen
A multinomial probit model is proposed to examine a categorical response variable, with the main objective being the identification of the influential variables in the model. To this end, a Bayesian selection technique using two hierarchical indicators is employed. The first indicator denotes a variable's relevance to the categorical response, and the subsequent indicator relates to the variable's importance at a specific categorical level, which aids in assessing its impact at that level. The selection process relies on the posterior indicator samples generated through an MCMC algorithm. The efficacy of our Bayesian selection strategy is demonstrated through both simulation and an application to a real-world example.
Volume 212, Article 108233. Citations: 0
Enhancing approximate modular Bayesian inference by emulating the conditional posterior
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-20 DOI: 10.1016/j.csda.2025.108235
Grant Hutchings, Kellin N. Rumsey, Derek Bingham, Gabriel Huerta
In modular Bayesian analyses, complex models are composed of distinct modules, each representing different aspects of the data or prior information. In this context, fully Bayesian approaches can sometimes lead to undesirable feedback between modules, compromising the integrity of the inference. The “cut-distribution” prevents unwanted influence between modules by “cutting” feedback. The direct sampling (DS) algorithm is standard practice for approximating the cut-distribution, but it can be computationally intensive, especially when the number of imputations required is large. An enhanced method is proposed, the Emulating the Conditional Posterior (ECP) algorithm, which leverages emulation to increase the number of imputations. Numerical experiments demonstrate that the ECP algorithm outperforms the traditional DS approach in terms of accuracy and computational efficiency, particularly when resources are constrained. It is also shown how the DS algorithm can be improved using ideas from design of experiments. Some practical recommendations are given for algorithm choice in modular Bayesian analyses.
Volume 212, Article 108235. Citations: 0
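The two-stage logic behind sampling from a cut distribution, as discussed in the Hutchings et al. abstract above, can be sketched on a toy problem. The model below is hypothetical and chosen only because both stages have closed forms: module 1 has y1_i ~ N(phi, 1) with a flat prior on phi, and module 2 has y2_j ~ N(phi + theta, 1) with theta ~ N(0, 1). The function name `cut_direct_sampling` is likewise an illustrative assumption, and this is one reading of the two-stage scheme, not the paper's DS or ECP implementation.

```python
import numpy as np


def cut_direct_sampling(y1, y2, n_imputations, rng):
    """Draw (phi, theta) pairs from the cut distribution of a toy two-module model.

    Module 1: y1_i ~ N(phi, 1), flat prior on phi.
    Module 2: y2_j ~ N(phi + theta, 1), theta ~ N(0, 1).
    """
    n1, n2 = len(y1), len(y2)
    # Stage 1: phi is informed by module 1 only, so y2 cannot feed back into it.
    phi = rng.normal(np.mean(y1), 1.0 / np.sqrt(n1), size=n_imputations)
    # Stage 2: one draw of theta from p(theta | phi, y2) for each imputed phi.
    post_var = 1.0 / (n2 + 1.0)
    post_mean = post_var * n2 * (np.mean(y2) - phi)
    theta = rng.normal(post_mean, np.sqrt(post_var))
    return phi, theta
```

In realistic models stage 2 has no closed form, so each imputed phi needs its own sampler run for theta. That per-imputation cost is what makes a large number of imputations expensive.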
Model-based clustering for covariance matrices via penalized Wishart mixture models
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-20 DOI: 10.1016/j.csda.2025.108232
Andrea Cappozzo, Alessandro Casa
Covariance matrices provide a valuable source of information about complex interactions and dependencies within the data. However, from a clustering perspective, this information has often been underutilized and overlooked. Indeed, commonly adopted distance-based approaches tend to rely primarily on mean levels to characterize and differentiate between groups. Recently, there have been promising efforts to cluster covariance matrices directly, thereby distinguishing groups solely based on the relationships between variables. From a model-based perspective, a probabilistic formalization has been provided by considering a mixture model with component densities following a Wishart distribution. Notwithstanding, this approach faces challenges when dealing with a large number of variables, as the number of parameters to be estimated increases quadratically. To address this issue, a sparse Wishart mixture model is proposed, which assumes that the component scale matrices possess a cluster-dependent degree of sparsity. Model estimation is performed by maximizing a penalized log-likelihood, enforcing a covariance graphical lasso penalty on the component scale matrices. This penalty not only reduces the number of non-zero parameters, mitigating the challenges of high-dimensional settings, but also enhances the interpretability of results by emphasizing the most relevant relationships among variables. The proposed methodology is tested on both simulated and real data, demonstrating its ability to unravel the complexities of neuroimaging data and effectively cluster subjects based on the relational patterns among distinct brain regions.
Volume 212, Article 108232. Citations: 0
Joint estimation of precision matrices for long-memory time series
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-19 DOI: 10.1016/j.csda.2025.108234
Qihu Zhang, Jongik Chung, Cheolwoo Park
Methods are proposed for estimating multiple precision matrices for long-memory time series, with particular emphasis on the analysis of resting-state functional magnetic resonance imaging (fMRI) data obtained from multiple subjects. The objective is to estimate both individual brain networks and a common structure representative of a group. Several approaches employing weighted aggregation are introduced to simultaneously estimate individual and group-level precision matrices. Convergence rates of the estimators are examined under various norms and expectations, and their performance is evaluated under both sub-Gaussian and heavy-tailed distributions. The proposed methods are demonstrated through simulated data and real resting-state fMRI datasets.
Volume 212, Article 108234. Citations: 0
Inference on a stochastic SIR model including growth curves
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-16 DOI: 10.1016/j.csda.2025.108231
Giuseppina Albano, Virginia Giorno, Gema Pérez-Romero, Francisco de Asis Torres-Ruiz
A Susceptible-Infected-Removed stochastic model is presented, in which the stochasticity is introduced through two independent Brownian motions in the dynamics of the Susceptible and Infected populations. To account for the natural evolution of the Susceptible population, a growth function is considered in which size is influenced by the birth and death of individuals. Inference for such a model is addressed by means of a Quasi Maximum Likelihood Estimation (QMLE) method. The resulting nonlinear system can be numerically solved by iterative procedures. A technique to obtain the initial solutions usually required by such methods is also provided. Finally, simulation studies are performed for three well-known growth functions, namely Gompertz, Logistic and Bertalanffy curves. The performance of the initial estimates of the involved parameters is assessed, and the goodness of the proposed methodology is evaluated.
Volume 212, Article 108231. Citations: 0
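To make the kind of model in the Albano et al. abstract above concrete, here is a minimal Euler-Maruyama simulation sketch. The abstract does not give the equations, so the specification used here is an assumption: dS = (h(t) S - beta S I) dt + sigma_s S dW1 and dI = (beta S I - gamma I) dt + sigma_i I dW2, with independent Brownian motions W1 and W2 and a Gompertz-type relative growth rate h(t) = a exp(-b t). The names `simulate_sir` and `gompertz_rate` are hypothetical.

```python
import numpy as np


def gompertz_rate(a, b):
    """Relative growth rate h(t) = a * exp(-b * t) of a Gompertz curve."""
    return lambda t: a * np.exp(-b * t)


def simulate_sir(s0, i0, beta, gamma, sigma_s, sigma_i, growth_rate,
                 t_end, n_steps, rng):
    """Euler-Maruyama path of an assumed stochastic SIR model with growth.

    Susceptibles grow at relative rate growth_rate(t) and are infected at
    rate beta * S * I; infected individuals are removed at rate gamma.
    Multiplicative noise enters S and I through independent increments.
    """
    dt = t_end / n_steps
    t = np.linspace(0.0, t_end, n_steps + 1)
    s = np.empty(n_steps + 1)
    i = np.empty(n_steps + 1)
    s[0], i[0] = s0, i0
    for k in range(n_steps):
        dw_s, dw_i = rng.normal(0.0, np.sqrt(dt), size=2)
        infection = beta * s[k] * i[k]
        s[k + 1] = s[k] + (growth_rate(t[k]) * s[k] - infection) * dt \
            + sigma_s * s[k] * dw_s
        i[k + 1] = i[k] + (infection - gamma * i[k]) * dt \
            + sigma_i * i[k] * dw_i
    return t, s, i
```

With both noise levels and the infection rate set to zero, the susceptible path reduces to the deterministic Gompertz curve S(t) = S0 exp((a/b)(1 - exp(-b t))), which gives a quick sanity check on the discretisation. Swapping in a logistic or Bertalanffy rate only requires a different `growth_rate` function.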
Privacy-preserving communication-efficient spectral clustering for distributed multiple networks
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-09 DOI: 10.1016/j.csda.2025.108230
Shanghao Wu, Xiao Guo, Hai Zhang
Multi-layer networks arise naturally in various scientific domains, including the social sciences, biology, and neuroscience, among others. The network layers of a given multi-layer network are commonly stored in a local and distributed fashion because of privacy, ownership, and communication costs. The literature on community detection based on these data is still limited. This paper proposes a new distributed spectral clustering-based algorithm for consensus community detection of the locally stored multi-layer network. The algorithm is based on the power method. It is communication-efficient by allowing multiple local power iterations before aggregation, and privacy-preserving by incorporating the notion of differential privacy. The convergence rate of the proposed algorithm is studied under the assumption that the multi-layer networks are generated from the multi-layer stochastic block models. Numerical studies show the superior performance of the proposed algorithm over competitive algorithms.
Volume 212, Article 108230. Citations: 0
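The idea in the Wu et al. abstract above, several local power iterations per communication round followed by aggregation, can be sketched generically as follows. This is not the paper's algorithm. It is a plain orthogonal-iteration scheme with Procrustes alignment before averaging, and the `noise_scale` parameter merely perturbs what each site sends: it is not calibrated to any differential-privacy budget. All function names and defaults are hypothetical.

```python
import numpy as np


def orthonormalise(M):
    """Return an orthonormal basis for the column space of M (thin QR)."""
    Q, _ = np.linalg.qr(M)
    return Q


def align(V, V_ref):
    """Rotate V onto V_ref (orthogonal Procrustes) so bases can be averaged."""
    U, _, Wt = np.linalg.svd(V.T @ V_ref)
    return V @ (U @ Wt)


def distributed_subspace(layers, k, rounds=20, local_iters=2,
                         noise_scale=0.0, rng=None):
    """Estimate a shared k-dimensional leading subspace of several adjacency
    matrices, communicating only once per round."""
    rng = np.random.default_rng() if rng is None else rng
    n = layers[0].shape[0]
    V = orthonormalise(rng.standard_normal((n, k)))
    for _round in range(rounds):
        updates = []
        for A in layers:                        # runs locally at each site
            V_loc = V
            for _step in range(local_iters):    # power steps, no communication
                V_loc = orthonormalise(A @ V_loc)
            # Perturb the message before it leaves the site (uncalibrated).
            V_loc = V_loc + noise_scale * rng.standard_normal((n, k))
            updates.append(align(V_loc, V))
        V = orthonormalise(np.mean(updates, axis=0))  # one aggregation per round
    return V


def kmeans(X, k, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels
```

Consensus communities are then obtained by clustering the rows of the returned basis, for example `kmeans(distributed_subspace(layers, k=2), 2)`. Raising `local_iters` trades communication rounds for local computation, which is the efficiency lever the abstract refers to.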
Flexible modeling of left-truncated and interval-censored competing risks data with missing event types
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-05 DOI: 10.1016/j.csda.2025.108229
Yichen Lou, Yuqing Ma, Liming Xiang, Jianguo Sun
Interval-censored competing risks data arise in many cohort studies in clinical research, where multiple types of events subject to interval censoring are included and the occurrence of the primary event of interest may be censored by the occurrence of other events. The presence of missing event types and left truncation poses challenges to the regression analysis of such data. We propose a new two-stage estimation procedure under a class of semiparametric generalized odds rate transformation models to overcome these challenges. Our method first facilitates the estimation of both the probability of response and the probability of occurrence of each type of event under the missing at random assumption, using either parametric or non-parametric methods. An augmented inverse probability weighting likelihood based on the complete-case likelihood and data from subjects with missing type of event is then maximized for estimating regression parameters. We provide desirable asymptotic properties and construct a concordance index to evaluate the model's discriminative ability. The proposed method is demonstrated through extensive simulations and the analysis of data from the Amsterdam cohort study on HIV infection and AIDS.
间隔审查竞争风险数据出现在临床研究中的许多队列研究中,其中包括受间隔审查的多种类型的事件,并且主要感兴趣事件的发生可能被其他事件的发生所审查。缺失事件类型和左截断的存在对此类数据的回归分析提出了挑战。为了克服这些挑战,我们在一类半参数广义比值率变换模型下提出了一种新的两阶段估计方法。我们的方法首先使用参数或非参数方法,便于在随机缺失假设下估计响应概率和每种事件发生的概率。然后,基于完全案例似然和缺失事件类型的受试者数据的增广逆概率加权似然最大化用于估计回归参数。我们给出了理想的渐近性质,并构造了一个一致性指标来评价模型的判别能力。提出的方法是通过广泛的模拟和数据分析从阿姆斯特丹队列研究艾滋病毒感染和艾滋病证明。
Yichen Lou, Yuqing Ma, Liming Xiang, Jianguo Sun. "Flexible modeling of left-truncated and interval-censored competing risks data with missing event types." Computational Statistics & Data Analysis, vol. 211, Article 108229. Published 2025-06-05. DOI: 10.1016/j.csda.2025.108229
Citations: 0
Region detection and image clustering via sparse Kronecker product decomposition
IF 1.5 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-06-03 DOI: 10.1016/j.csda.2025.108226
Guang Yang, Long Feng
Image clustering is usually conducted by vectorizing image pixels, treating them as independent, and applying classical clustering approaches to the obtained features. However, as image data are often high-dimensional and contain rich spatial information, such treatment is far from satisfactory. For medical image data, another important characteristic is the region-wise sparseness in signals. That is to say, only a few unknown regions in the medical image differentiate the images associated with different groups of patients, while other regions are uninformative. Accurately detecting these informative regions would not only improve clustering accuracy but, more importantly, also provide interpretations for the rationale behind them. Motivated by the need to identify significant regions of interest, we propose a general framework named Image Clustering via Sparse Kronecker Product Decomposition (IC-SKPD). This framework aims to simultaneously divide samples into clusters and detect regions that are informative for clustering. Our framework is general in the sense that it provides a unified treatment for matrix and tensor-valued samples. An iterative hard-thresholded singular value decomposition approach is developed to solve this model. Theoretically, the IC-SKPD enjoys guarantees for clustering accuracy and region detection consistency under mild conditions on the minimum signals. Comprehensive simulations along with real data analysis further validate the superior performance of IC-SKPD on clustering and region detection.
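To make the Kronecker structure and the hard-thresholding step concrete, here is a minimal sketch under simplifying assumptions. It is not the IC-SKPD algorithm: it handles a single noiseless matrix-valued image, a single Kronecker term, and a known sparsity level `s`, and it swaps in a plain hard-thresholded power iteration on the block-rearranged image. All function names are hypothetical. The sketch relies on the standard fact that rearranging `kron(A, B)` so that each block becomes one row yields the rank-one matrix `vec(A) vec(B)^T`, so a sparse left singular vector marks the informative blocks.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def rearrange(c, m1, n1, m2, n2):
    """Return one row per (m2 x n2) block of c, flattened row-major."""
    return [[c[i * m2 + k][j * n2 + l] for k in range(m2) for l in range(n2)]
            for i in range(m1) for j in range(n1)]


def hard_threshold(v, s):
    """Keep the s entries of v with largest magnitude, zero the rest."""
    keep = set(sorted(range(len(v)), key=lambda i: -abs(v[i]))[:s])
    return [x if i in keep else 0.0 for i, x in enumerate(v)]


def sparse_rank1(r, s, iters=20):
    """Rank-one fit r ~ sigma * u v^T with at most s nonzeros in u."""
    n_rows, n_cols = len(r), len(r[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iters):
        u = [sum(r[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        u = hard_threshold(u, s)
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u == 0.0:
            raise ValueError("iterate collapsed to zero; try another start")
        u = [x / norm_u for x in u]
        w = [sum(r[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        sigma = math.sqrt(sum(x * x for x in w))
        v = [x / sigma for x in w]
    return u, sigma, v


# Toy image: A marks two informative blocks on a 3 x 3 grid, B is the
# 2 x 2 signal shape shared by those blocks.
A = [[0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 1.0]]
B = [[1.0, 2.0],
     [3.0, 4.0]]
image = kron(A, B)
R = rearrange(image, 3, 3, 2, 2)
u, sigma, v = sparse_rank1(R, s=2)
detected_blocks = [i for i, x in enumerate(u) if x != 0.0]
print(detected_blocks)
```

On this toy input the nonzero entries of `u` sit at flat block indices 4 and 8, which are the two nonzero positions of `A`. The method in the abstract additionally clusters many samples and covers tensor-valued data, neither of which this sketch attempts.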
Guang Yang, Long Feng. "Region detection and image clustering via sparse Kronecker product decomposition." Computational Statistics & Data Analysis, vol. 211, Article 108226. Published 2025-06-03. DOI: 10.1016/j.csda.2025.108226
Citations: 0