
Computational Statistics & Data Analysis: Latest Articles

Testing the equality of high dimensional distributions
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-07-09; DOI: 10.1016/j.csda.2025.108245
Reza Modarres
The Euclidean distance is not a suitable distance for high dimensional settings due to the distance concentration phenomenon. A novel statistic that is inspired by the interpoint distances, but avoids their computation, is proposed for comparing and visualizing high dimensional datasets. The new statistic is based on a high dimensional dissimilarity index that takes advantage of the concentration phenomenon. A simultaneous display of observation means and standard deviations, which aids visualization and the detection of suspect outliers and enhances separability among the competing classes in the transformed space, is discussed. The finite sample convergence of the dissimilarity indices is studied, nine statistics are compared under several distributions, and three applications are presented.
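The distance concentration phenomenon mentioned in this abstract can be illustrated with a minimal sketch. It is not the statistic proposed in the paper; the function name `relative_contrast` and the choice of standard normal coordinates are assumptions made for illustration. The sketch measures how the spread of pairwise Euclidean distances shrinks relative to their size as the dimension grows.

```python
import math
import random


def relative_contrast(n_points, dim, seed=0):
    """Return (max - min) / min over all pairwise Euclidean distances."""
    rng = random.Random(seed)
    # Standard normal coordinates are an illustrative assumption.
    pts = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_points)]
    dists = [
        math.dist(pts[i], pts[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # The contrast is expected to shrink as the dimension increases.
    for dim in (2, 20, 200):
        print(dim, relative_contrast(30, dim))
```

A small contrast means the nearest and farthest points are almost equally far away, which is why raw Euclidean distances become less informative in high dimensions.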
Computational Statistics & Data Analysis, vol. 212, Article 108245
Citations: 0
High-dimensional response growth curve modeling for longitudinal neuroimaging analysis
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-07-07; DOI: 10.1016/j.csda.2025.108239
Lu Wang, Xiang Lyu, Lexin Li
There is increasing interest in modeling high-dimensional longitudinal outcomes in applications such as developmental neuroimaging research. The growth curve model offers a useful tool to capture both the mean growth pattern across individuals and the dynamic changes of outcomes over time within each individual. However, when the number of outcomes is large, it becomes challenging and often infeasible to tackle the large covariance matrix of the random effects involved in the model. A high-dimensional response growth curve model, with three novel components, is proposed: a low-rank factor model structure that substantially reduces the number of parameters in the large covariance matrix, a re-parameterization formulation coupled with a sparsity penalty that selects important fixed and random effect terms, and a computational trick that turns the inversion of a large matrix into the inversion of a stack of small matrices and thus considerably speeds up the computation. An efficient expectation-maximization-type estimation algorithm is developed, and the competitive performance of the proposed method is demonstrated through both simulations and a longitudinal study of brain structural connectivity in association with human immunodeficiency virus.
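The abstract does not spell out its computational trick. As a loosely related illustration of how low-rank structure avoids inverting a large matrix, the sketch below uses the standard Sherman-Morrison identity for a diagonal matrix plus a rank-one term. The function name `solve_diag_plus_rank_one` is hypothetical, and this is not claimed to be the authors' method.

```python
def solve_diag_plus_rank_one(d, u, b):
    """Solve (diag(d) + u u^T) x = b without forming or inverting the full matrix.

    Uses the Sherman-Morrison identity:
    (D + u u^T)^{-1} b = D^{-1} b - D^{-1} u (u^T D^{-1} b) / (1 + u^T D^{-1} u).
    Assumes every entry of d is positive.
    """
    dinv_b = [bi / di for bi, di in zip(b, d)]
    dinv_u = [ui / di for ui, di in zip(u, d)]
    numerator = sum(ui * v for ui, v in zip(u, dinv_b))
    denominator = 1.0 + sum(ui * w for ui, w in zip(u, dinv_u))
    scale = numerator / denominator
    return [v - scale * w for v, w in zip(dinv_b, dinv_u)]
```

The cost is linear in the dimension, whereas a generic dense solve is cubic; a rank-r factor structure generalises this through the Woodbury identity, which needs only an r-by-r inverse.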
Computational Statistics & Data Analysis, vol. 212, Article 108239
Citations: 0
Simultaneously detecting spatiotemporal changes with penalized Poisson regression models
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-07-07; DOI: 10.1016/j.csda.2025.108240
Zerui Zhang, Xin Wang, Xin Zhang, Jing Zhang
In large-scale spatiotemporal data, abrupt changes commonly occur across both spatial and temporal domains. To address the concurrent challenges of detecting change points and identifying spatial clusters within spatiotemporal count data, an innovative method is introduced based on the Poisson regression model. The proposed method employs doubly fused penalization to unveil the underlying spatiotemporal change patterns. To efficiently estimate the model, an iterative shrinkage and thresholding based algorithm is developed to minimize the doubly penalized likelihood function. The reliability and accuracy of the method are confirmed by its statistical consistency properties. Furthermore, extensive numerical experiments are conducted to validate the theoretical findings, thereby highlighting the superior performance of the proposed method when compared to existing competitive approaches.
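Two ingredients named in this abstract can be sketched generically: the soft-thresholding operator at the core of iterative shrinkage-thresholding algorithms, and a fused penalty on differences between neighbours in time and in space. The exact penalty in the paper is not given in the abstract, so the layout `theta[site][time]`, the `neighbors` list, and both function names are assumptions.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def doubly_fused_penalty(theta, neighbors, lam_time, lam_space):
    """Penalise differences between consecutive times and between neighbouring sites.

    theta[s][t] is the coefficient at site s and time t (hypothetical layout);
    neighbors is a list of (s1, s2) index pairs of adjacent sites.
    """
    temporal = sum(
        abs(a - b) for series in theta for a, b in zip(series, series[1:])
    )
    spatial = sum(
        abs(a - b) for s1, s2 in neighbors for a, b in zip(theta[s1], theta[s2])
    )
    return lam_time * temporal + lam_space * spatial
```

Under such a penalty, a zero temporal difference means no change point between two times, and a zero spatial difference puts two sites in the same cluster.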
Computational Statistics & Data Analysis, vol. 212, Article 108240
Citations: 0
New tests for the identity and sphericity of high-dimensional covariance matrices via U-statistics
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-07-05; DOI: 10.1016/j.csda.2025.108242
Xiaoge Xiong
Two novel test procedures are proposed for the identity and sphericity of covariance matrices in high-dimensional asymptotic frameworks, both constructed via U-statistics. The limiting distributions of these tests are established under null and local alternative hypotheses. Monte Carlo simulation results demonstrate their superiority over several competing methods across various scenarios, with the proposed tests achieving full power against both dense and sparse alternatives. The effectiveness of the proposed tests is further validated through an application to a colon dataset.
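As background on how U-statistics enter covariance testing, the sketch below computes a standard U-statistic that is unbiased for tr(Σ²) when the observations are independent with a known mean of zero. It is a textbook building block, not the test statistics proposed in the paper, and the function name is hypothetical.

```python
from itertools import combinations


def trace_sigma_squared_ustat(rows):
    """U-statistic estimate of tr(Sigma^2), assuming i.i.d. mean-zero rows.

    Averages (x_i . x_j)^2 over all ordered pairs i != j; each unordered pair
    is visited once and counted twice.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two observations")
    total = 0.0
    for x, y in combinations(rows, 2):
        total += sum(a * b for a, b in zip(x, y)) ** 2
    return 2.0 * total / (n * (n - 1))
```

Only products of distinct observations are used, which avoids the bias that the diagonal terms would introduce when the dimension is comparable to or larger than the sample size.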
Computational Statistics & Data Analysis, vol. 212, Article 108242
Citations: 0
High-dimensional and banded integer-valued autoregressive processes
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-07-04; DOI: 10.1016/j.csda.2025.108243
Nuo Xu, Kai Yang
The modeling of high-dimensional time series has always been an appealing and challenging problem. The main difficulties of modeling high-dimensional time series lie in the curse of dimensionality and complex cross dependence between adjacent components. To solve these problems for high-dimensional time series of counts, a class of high-dimensional and banded integer-valued autoregressive processes without assuming the innovation's distribution is proposed. A banded thinning structure is constructed to diminish the parameters' dimension. The componentwise conditional least squares and weighted conditional least squares methods are developed to estimate the banded autoregressive coefficient matrices. The bandwidth parameter is identified via a marginal Bayesian information criterion method. Some numerical results are provided to show the good performance of the estimators. Finally, the superiority of the proposed model is shown by an application to an air quality data set of different cities.
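A minimal simulation sketch of a banded, first-order integer-valued autoregression built on binomial thinning follows. The paper leaves the innovation distribution unspecified; Poisson innovations, the order-one dynamics, and all function names here are assumptions made only so that the example can run.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def thin(rng, count, alpha):
    """Binomial thinning: successes among `count` independent Bernoulli(alpha) trials."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_banded_inar1(coef, bandwidth, innovation_mean, steps, seed=0):
    """Simulate counts where component i depends only on components j with |i - j| <= bandwidth."""
    rng = random.Random(seed)
    p = len(coef)
    x = [poisson(rng, innovation_mean) for _ in range(p)]
    path = [x]
    for _ in range(steps):
        new = []
        for i in range(p):
            value = poisson(rng, innovation_mean)  # innovation term
            for j in range(max(0, i - bandwidth), min(p, i + bandwidth + 1)):
                value += thin(rng, x[j], coef[i][j])  # thinned neighbour counts
            new.append(value)
        x = new
        path.append(x)
    return path
```

With bandwidth k, each row of the coefficient matrix has at most 2k + 1 free entries, so the parameter count grows linearly rather than quadratically in the dimension.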
Computational Statistics & Data Analysis, vol. 212, Article 108243
Citations: 0
Conditional inference for ultrahigh-dimensional additive hazards model
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-07-04; DOI: 10.1016/j.csda.2025.108244
Meiling Hao, Ruiyu Yang, Fangfang Bai, Liuquan Sun
In the realm of high-throughput genomic data, modeling with ultrahigh-dimensional covariates and censored survival outcomes is of great importance. We conduct conditional inference for the ultrahigh-dimensional additive hazards model, allowing both the covariates of interest and nuisance covariates to be ultrahigh-dimensional. The presence of right censoring in survival outcomes adds an extra layer of complexity to the original data structure, posing significant challenges for the ultrahigh-dimensional additive hazards model. To address this, we introduce an innovative test statistic based on the quadratic norm of the score function. Moreover, when there is a high correlation between the covariates of interest and nuisance covariates, we propose a decorrelated score function-based test statistic to enhance statistical power. Additionally, we establish the limiting distributions of the test statistics under both the null and local alternative hypotheses, further enhancing the computational appeal of our approach. The proposed statistics are thoroughly evaluated through extensive simulation studies and applied to two real data examples.
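For orientation, the setting described in this abstract can be written schematically as follows. The symbols are assumptions, since the abstract fixes no notation: X denotes the covariates of interest, W the nuisance covariates, and U_n a score function for β evaluated under the null. The exact centering and scaling of the statistic are not given in the abstract, so the last line indicates only its general form.

```latex
% Additive hazards model with covariates of interest X and nuisance covariates W
\lambda(t \mid X, W) = \lambda_0(t) + \beta^{\top} X + \gamma^{\top} W

% Conditional inference: test the covariates of interest given the nuisance part
H_0 : \beta = 0 \quad \text{versus} \quad H_1 : \beta \neq 0

% Schematic form of a quadratic-norm score statistic (normalisation omitted)
T_n \;\propto\; \lVert U_n(\hat{\gamma}) \rVert_2^{2}
```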
Computational Statistics & Data Analysis, vol. 212, Article 108244
Citations: 0
Pure interaction effects unseen by Random Forests
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-07-01; DOI: 10.1016/j.csda.2025.108237
Ricardo Blum, Munir Hiabu, Enno Mammen, Joseph T. Meyer
Random Forests are widely claimed to capture interactions well. However, some simple examples suggest that they perform poorly in the presence of certain pure interactions that the conventional CART criterion struggles to capture during tree construction. Motivated by this, it is argued that simple alternative partitioning schemes used in the tree growing procedure can enhance identification of these interactions. In a simulation study these variants are compared to conventional Random Forests and Extremely Randomized Trees. The results validate that the modifications considered enhance the model's fitting ability in scenarios where pure interactions play a crucial role. Finally, the methods are applied to real datasets.
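A small worked example of a pure interaction is the XOR pattern, where the response depends on two features jointly but on neither alone. The sketch below, with hypothetical helper names, computes the variance reduction that a CART-style criterion assigns to a split; on balanced XOR data the first split on either feature earns zero reduction, so the criterion has no reason to prefer it.

```python
def variance(values):
    """Population variance of a non-empty list."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def impurity_decrease(xs, ys, feature, threshold):
    """Variance reduction from splitting on xs[i][feature] <= threshold."""
    left = [y for x, y in zip(xs, ys) if x[feature] <= threshold]
    right = [y for x, y in zip(xs, ys) if x[feature] > threshold]
    if not left or not right:
        raise ValueError("split leaves an empty child")
    n = len(ys)
    weighted = len(left) / n * variance(left) + len(right) / n * variance(right)
    return variance(ys) - weighted


# Balanced XOR data: y = x1 XOR x2.
XS = [(0, 0), (0, 1), (1, 0), (1, 1)]
YS = [0.0, 1.0, 1.0, 0.0]
```

Here `impurity_decrease(XS, YS, 0, 0.5)` is zero, while splitting the left child `XS[:2]` on the second feature removes all remaining variance. The signal only becomes visible after an uninformative-looking first split has been made.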
Computational Statistics & Data Analysis, vol. 212, Article 108237
Citations: 0
Variable selection for spatio-temporal conditionally Poisson point processes
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-06-27; DOI: 10.1016/j.csda.2025.108238
Achmad Choiruddin, Jonatan A. González, Jorge Mateu, Alwan Fadlurohman, Rasmus Waagepetersen
Spatio-temporal point pattern data are becoming prevalent in many scientific disciplines. We consider a sequence of spatial point processes where each point process is Poisson given the past. We model the conditional first-order intensity function of each point process as a parametric log-linear function of spatial, temporal, and spatio-temporal covariates that may depend on previous point patterns. Dealing with spatio-temporal covariates brings computational and methodological challenges compared to the purely spatial case. We extend regularisation methods for spatial point process variable selection to obtain parsimonious and interpretable models in the considered spatio-temporal case. Using our proposed methodology, we conduct two simulation studies and examine an application to criminal activity in the Kennedy district of Bogota. In the application, we consider spatio-temporal point pattern data on crime locations and a number of spatial, temporal, and spatio-temporal covariates related to urban places, environmental factors, and further space-time factors. The intensity function of vehicle thefts is estimated, considering other crimes as covariate information. The proposed methodology offers a comprehensive approach for analysing spatio-temporal point pattern crime data, capturing complex relationships between covariates and crime occurrences over space and time.
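The idea of regularised estimation for a log-linear intensity can be sketched on a gridded approximation: each cell has a count, a covariate vector, and an area, and the expected count is the area times the exponential of a linear predictor. The lasso penalty, the grid discretisation, and the function name are all assumptions; the abstract says only that regularisation methods are extended.

```python
import math


def penalized_poisson_objective(counts, covariates, areas, beta, lam):
    """Negative Poisson log-likelihood (up to constants) plus a lasso penalty.

    The expected count in cell i is areas[i] * exp(beta . covariates[i]).
    Terms that do not depend on beta (log factorials, n * log(area)) are dropped.
    """
    nll = 0.0
    for n_i, z_i, a_i in zip(counts, covariates, areas):
        eta = sum(b * z for b, z in zip(beta, z_i))
        nll += a_i * math.exp(eta) - n_i * eta
    return nll + lam * sum(abs(b) for b in beta)
```

Minimising such an objective over `beta` for a range of `lam` values drives some coefficients to exactly zero, which is what makes the fitted model parsimonious and easier to interpret.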
Computational Statistics & Data Analysis, vol. 212, Article 108238
Citations: 0
A score-based threshold effect test in time series models
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-06-25; DOI: 10.1016/j.csda.2025.108236
Shufang Wei, Yaping Deng, Yaxing Yang
A score-based test statistic is developed to compare a linear ARMA model with its threshold extension. In particular, the focus is on testing the threshold effect in continuous threshold models with no jump at the threshold. Notably, while developed for continuous threshold models, the proposed test remains effective for discontinuous cases. The proposed test does not require fitting the model under the alternative hypothesis, making it computationally more efficient than the quasi-likelihood ratio test. The asymptotic distributions of the score-based test statistic are derived under both the null hypothesis and local alternatives. Simulations indicate that the proposed test has better size than the quasi-likelihood ratio test and demonstrates stronger power compared to the Lagrange Multiplier test. The asymptotic theory of the least squares estimation for the continuous threshold ARMA model is further established. An application to the quarterly U.S. civilian unemployment rate data is given.
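To make "continuous threshold model with no jump at the threshold" concrete, the sketch below simulates a simplified first-order autoregression whose slope changes above a threshold r through a hinge term. The paper treats ARMA models; the AR(1) form, Gaussian noise, and function names are simplifying assumptions.

```python
import random


def conditional_mean(x_prev, phi, psi, threshold):
    """Mean of X_t given X_{t-1}: linear part plus a hinge that starts at the threshold."""
    return phi * x_prev + psi * max(x_prev - threshold, 0.0)


def simulate_continuous_threshold_ar1(phi, psi, threshold, n, seed=0):
    """Simulate X_t = phi * X_{t-1} + psi * (X_{t-1} - r)^+ + e_t with N(0, 1) noise."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n):
        x = conditional_mean(x, phi, psi, threshold) + rng.gauss(0.0, 1.0)
        path.append(x)
    return path
```

Setting `psi = 0` recovers the linear model, so the null hypothesis of no threshold effect corresponds to `psi = 0`. The hinge term keeps the conditional mean continuous at `r`, which is the sense in which there is no jump.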
Computational Statistics & Data Analysis, vol. 212, Article 108236
Citations: 0
Bayesian selection approach for categorical responses via multinomial probit models
IF 1.5; Zone 3, Mathematics; Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-06-20; DOI: 10.1016/j.csda.2025.108233
Chi-Hsiang Chu, Kuo-Jung Lee, Chien-Chin Hsu, Ray-Bing Chen
A multinomial probit model is proposed to examine a categorical response variable, with the main objective being the identification of the influential variables in the model. To this end, a Bayesian selection technique using two hierarchical indicators is employed. The first indicator denotes a variable's relevance to the categorical response, and the subsequent indicator relates to the variable's importance at a specific categorical level, which aids in assessing its impact at that level. The selection process relies on the posterior indicator samples generated through an MCMC algorithm. The efficacy of our Bayesian selection strategy is demonstrated through both simulation and an application to a real-world example.
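Selection from posterior indicator samples can be sketched generically: average each indicator over the MCMC draws to obtain a posterior inclusion probability, then keep the variables above a cutoff. The 0.5 cutoff is a common convention rather than something stated in the abstract, and the function names are hypothetical.

```python
def inclusion_probabilities(indicator_draws):
    """Column-wise mean of 0/1 indicator draws, one row per MCMC iteration."""
    n = len(indicator_draws)
    p = len(indicator_draws[0])
    return [sum(draw[j] for draw in indicator_draws) / n for j in range(p)]


def select_variables(probabilities, cutoff=0.5):
    """Indices whose posterior inclusion probability strictly exceeds the cutoff."""
    return [j for j, prob in enumerate(probabilities) if prob > cutoff]
```

In a two-level scheme like the one described, the same summary could be applied first to the variable-level indicators and then, for the retained variables, to the indicators attached to each response category.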
Computational Statistics & Data Analysis, vol. 212, Article 108233
Citations: 0