Computational Statistics & Data Analysis: Latest Articles

Expectile periodogram
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-31 DOI: 10.1016/j.csda.2025.108337
Tianbo Chen, Ta-Hsin Li, Hanbing Zhu, Wenwu Gao
This paper introduces a novel periodogram-like function, called the expectile periodogram (EP), for modeling spectral features of time series and detecting hidden periodicities. The EP is constructed from trigonometric expectile regression (ER), in which a specially designed loss function is used to substitute the squared ℓ2 norm that leads to the ordinary periodogram. The EP retains the key properties of the ordinary periodogram as a frequency-domain representation of serial dependence in time series, while offering a more comprehensive understanding by examining the data across the entire range of expectile levels. The asymptotic theory is established to investigate the relationship between the EP and the so-called expectile spectrum. Simulations demonstrate the efficiency of the EP in the presence of hidden periodicities. In addition, by leveraging the inherent two-dimensional nature of the EP, we train a deep learning model to classify earthquake waveform data. Notably, our approach outperforms alternative periodogram-based methods in terms of classification accuracy.
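The construction described in this abstract can be sketched numerically. The block below is a minimal illustration, not the authors' implementation: it assumes the expectile loss is the usual asymmetric squared loss |α − 1(u < 0)|·u², fits a trigonometric regression (intercept, cosine, sine) at each Fourier frequency by iteratively reweighted least squares, and scales the squared coefficient norm by n/4. The function names, the IRLS solver, and the n/4 scaling are all assumptions made for illustration; the scaling is chosen so that α = 0.5 reproduces the ordinary periodogram |DFT|²/n.

```python
import numpy as np

def expectile_fit(X, y, alpha, max_iter=100):
    """Expectile regression via iteratively reweighted least squares (assumed solver)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    for _ in range(max_iter):
        # Asymmetric weights: alpha for non-negative residuals, 1 - alpha otherwise.
        w = np.where(y - X @ beta >= 0, alpha, 1.0 - alpha)
        Xw = X * w[:, None]
        new_beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        if np.allclose(new_beta, beta, atol=1e-12):
            break
        beta = new_beta
    return beta

def expectile_periodogram(y, alpha):
    """Return Fourier frequencies in (0, pi) and an EP-like value at each (hypothetical scaling)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    t = np.arange(n)
    ks = np.arange(1, (n - 1) // 2 + 1)
    ep = np.empty(len(ks))
    for i, k in enumerate(ks):
        omega = 2 * np.pi * k / n
        X = np.column_stack([np.ones(n), np.cos(omega * t), np.sin(omega * t)])
        beta = expectile_fit(X, y, alpha)
        ep[i] = n / 4 * (beta[1] ** 2 + beta[2] ** 2)
    return 2 * np.pi * ks / n, ep
```

Evaluating `expectile_periodogram` over a grid of expectile levels α gives a two-dimensional array (level by frequency), which is the kind of image-like input the abstract refers to when it mentions the two-dimensional nature of the EP.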
Citations: 0
Recursive variational Gaussian approximation with the Whittle likelihood for linear non-Gaussian state space models
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-27 DOI: 10.1016/j.csda.2025.108324
Bao Anh Vu, David Gunawan, Andrew Zammit-Mangion
Parameter inference for linear and non-Gaussian state space models is challenging because the likelihood function contains an intractable integral over the latent state variables. While Markov chain Monte Carlo (MCMC) methods provide exact samples from the posterior distribution as the number of samples goes to infinity, they tend to have high computational cost, particularly for observations of a long time series. When inference with MCMC methods is computationally expensive, variational Bayes (VB) methods are a useful alternative. VB methods approximate the posterior density of the parameters with a simple and tractable distribution found through optimisation. A novel sequential VB algorithm that makes use of the Whittle likelihood is proposed for computationally efficient parameter inference in linear, non-Gaussian state space models. The algorithm, called Recursive Variational Gaussian Approximation with the Whittle Likelihood (R-VGA-Whittle), updates the variational parameters by processing data in the frequency domain. At each iteration, R-VGA-Whittle requires the gradient and Hessian of the Whittle log-likelihood, which are available in closed form. Through several examples involving a linear Gaussian state space model; a univariate/bivariate stochastic volatility model; and a state space model with Student’s t measurement error, where the latent states follow an autoregressive fractionally integrated moving average (ARFIMA) model, R-VGA-Whittle is shown to provide good approximations to posterior distributions of the parameters, and it is very computationally efficient when compared to asymptotically exact methods such as Hamiltonian Monte Carlo.
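For readers unfamiliar with the Whittle likelihood mentioned above, the following sketch shows how it is evaluated in the frequency domain. It is not R-VGA-Whittle itself: it only computes the Whittle log-likelihood for a plain AR(1) process, which is an illustrative model chosen here and not one of the paper's examples. The function names are hypothetical.

```python
import numpy as np

def ar1_spectral_density(omega, phi, sigma2):
    """Spectral density of an AR(1) process with coefficient phi and innovation variance sigma2."""
    return sigma2 / (2 * np.pi * (1 - 2 * phi * np.cos(omega) + phi ** 2))

def whittle_loglik(y, phi, sigma2):
    """Whittle log-likelihood summed over Fourier frequencies in (0, pi)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    k = np.arange(1, (n - 1) // 2 + 1)
    omega = 2 * np.pi * k / n
    periodogram = np.abs(np.fft.fft(y)[k]) ** 2 / (2 * np.pi * n)
    f = ar1_spectral_density(omega, phi, sigma2)
    return -np.sum(np.log(f) + periodogram / f)

def simulate_ar1(n, phi, seed=0):
    """Simulate a stationary AR(1) series with unit innovation variance."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    y = np.empty(n)
    y[0] = eps[0] / np.sqrt(1 - phi ** 2)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]
    return y

# Usage: a crude grid search over phi with sigma2 held at 1.
y = simulate_ar1(4000, 0.6)
grid = np.linspace(-0.9, 0.9, 181)
phi_hat = grid[int(np.argmax([whittle_loglik(y, p, 1.0) for p in grid]))]
```

Because each term depends on the parameters only through the spectral density, the gradient and Hessian can be written down analytically for models like this one, which is consistent with the closed-form derivatives the abstract mentions.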
Citations: 0
Random multiplication versus random sum: Autoregressive-like models with integer-valued random inputs
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-27 DOI: 10.1016/j.csda.2025.108323
Abdelhakim Aknouche, Sónia Gouveia, Manuel G. Scotto
A common approach to analyze time series of counts is to fit models based on random sum operators. As an alternative, this paper introduces time series models based on a random multiplication operator, which is simply the multiplication of a variable operand by an integer-valued random coefficient, whose mean is the constant operand. Such an operation is endowed into autoregressive-like models with integer-valued random inputs, addressed as RMINAR. Two special variants are studied, namely the ℕ₀-valued random coefficient autoregressive model and the ℕ₀-valued random coefficient multiplicative error model. Furthermore, ℤ-valued extensions are also considered. The dynamic structure of the proposed models is studied in detail. In particular, their corresponding solutions are everywhere strictly stationary and ergodic, which is not common in either the literature on integer-valued time series models or real-valued random coefficient autoregressive models. Therefore, RMINAR model parameters are estimated using a four-stage weighted least squares estimator, with consistency and asymptotic normality established everywhere in the parameter space. Finally, the performance of the new RMINAR models is illustrated with simulated and empirical examples.
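The difference between the two operators named in the title can be made concrete with a short simulation. This is a sketch under stated assumptions: the random coefficient and the innovations are taken to be Poisson, and the first-order recursion X_t = Φ_t·X_{t−1} + ε_t is one reading of the "random coefficient autoregressive" variant. Neither choice is confirmed by the abstract, and all function names are hypothetical.

```python
import numpy as np

def random_multiplication(x, a, rng):
    """Random multiplication: Phi * x with integer-valued Phi ~ Poisson(a), so E[Phi] = a."""
    return rng.poisson(a) * x

def random_sum(x, a, rng):
    """Random sum: the sum of x i.i.d. Poisson(a) terms, which is distributed Poisson(a * x)."""
    return rng.poisson(a * x)

def simulate_rminar1(n, a, lam, seed=0):
    """Simulate X_t = Phi_t * X_{t-1} + eps_t with Poisson coefficient and innovations (assumed)."""
    rng = np.random.default_rng(seed)
    phi = rng.poisson(a, size=n)     # integer-valued random coefficients with mean a
    eps = rng.poisson(lam, size=n)   # integer-valued innovations with mean lam
    x = np.zeros(n, dtype=np.int64)
    for t in range(1, n):
        x[t] = phi[t] * x[t - 1] + eps[t]
    return x

# Under these assumptions with a < 1, taking expectations gives a long-run mean of lam / (1 - a).
x = simulate_rminar1(100_000, a=0.5, lam=2.0)
```

Both operators map non-negative integers to non-negative integers and have conditional mean a·x; they differ in that the random sum draws one term per unit of x, while random multiplication draws a single coefficient and scales the whole operand.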
Citations: 0
Pure error REML for analyzing data from multi-stratum designs
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-25 DOI: 10.1016/j.csda.2025.108322
Steven G. Gilmour, Peter Goos, Heiko Großmann
Since the dawn of response surface methodology, it has been recommended that designs include replicate points, so that pure error estimates of variance can be obtained and used to provide reliable estimated standard errors of the effects of factors. In designs with more than one stratum, such as split-plot and split-split-plot designs, it is less obvious how pure error estimates of the variance components should be obtained, and no pure error estimates are given by the popular residual maximum likelihood (REML) method of estimation. A method of pure error REML estimation of the variance components, using the full treatment model, is obtained by treating each combination of factor levels as a discrete treatment. This method is easy to implement using standard software and improved estimated standard errors of the fixed effects estimates can be obtained by applying the Kenward-Roger correction based on the pure error REML estimates. The new method is illustrated using several data sets and the performance of pure error REML is compared with the standard REML method. The results are comparable when the assumed response surface model is correct, but the new method is considerably more robust in the case of model misspecification.
Citations: 0
Change-point detection for multivariate nonparametric regression with deep neural networks
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-25 DOI: 10.1016/j.csda.2025.108334
Houlin Zhou, Hanbing Zhu, Xuejun Wang
This article addresses the problem of detecting structural changes in multivariate nonparametric regression models, which commonly arise in high-dimensional and time-dependent data analysis. We propose a CUSUM-type test statistic constructed from estimators obtained via deep neural networks (DNNs). The theoretical properties of the proposed test statistic are rigorously derived under the null and alternative hypotheses. Under the assumptions of a low-dimensional manifold structure in the data support and a hierarchical model architecture, we demonstrate that the DNN-based change-point detection method can effectively mitigate the curse of dimensionality. Furthermore, we establish the asymptotic properties and derive the convergence rate of the estimator for the change-point location. Extensive comparative simulation studies confirm the effectiveness and superior performance of the proposed approach. Finally, we illustrate the practical applicability of the method through an empirical analysis using real-world regional electricity consumption data.
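To illustrate what a CUSUM-type statistic measures, the sketch below applies the classical CUSUM to a scalar sequence with a mean shift. This is a deliberate simplification: the paper builds its statistic from deep-neural-network regression estimators, whereas here the input is just a hypothetical sequence of residual-like values, and the function name and normalization are assumptions.

```python
import numpy as np

def cusum_statistic(e):
    """Classical CUSUM: max_k |S_k - (k/n) S_n| / (sd * sqrt(n)), and the maximizing k."""
    e = np.asarray(e, dtype=float)
    n = len(e)
    partial = np.cumsum(e)
    k = np.arange(1, n + 1)
    # Centered partial sums, scaled by the sample standard deviation and sqrt(n).
    path = np.abs(partial - k / n * partial[-1]) / (e.std(ddof=1) * np.sqrt(n))
    k_hat = int(np.argmax(path)) + 1  # estimated number of observations before the change
    return float(path.max()), k_hat

# Usage: 200 observations whose mean shifts by 2 after the 120th.
rng = np.random.default_rng(0)
e = rng.standard_normal(200)
e[120:] += 2.0
stat, k_hat = cusum_statistic(e)
```

A large value of `stat` points to a structural change, and `k_hat` serves as the change-point location estimate whose convergence rate the abstract refers to in the DNN setting.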
Citations: 0
Nonparametric density estimation on complex domains using manifold-aware Bayesian additive tree models
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-25 DOI: 10.1016/j.csda.2025.108335
Isaac Diaz-Ray, Huiyan Sang, Guanyu Hu, Ligang Lu
Density or intensity function estimation for point pattern data observed on complex domains finds wide applications in spatial data analysis. However, many existing popular density estimation methods face challenges when domains have irregular boundaries, line network structures, sharp concavities, or interior holes. A nonparametric Bayesian additive ensemble of spanning trees model is developed to model the distribution of event occurrences on complex domains. This model uses a random spanning tree weak learner, which can produce flexible and contiguous domain partitions while respecting its geometry and constraints. The method has the advantage of capturing both varying smoothness and sharp changes in density functions. An efficient exact likelihood-based Bayesian inference algorithm is proposed to estimate the density function with uncertainty measures, leveraging a data thinning strategy combined with Poisson-Gamma conjugacy. Simulation studies on various complex domains demonstrate the advantages of the proposed model over competing methods. The method is further applied to the analysis of basketball shot data and crime locations on a road network.
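The Poisson-Gamma conjugacy that the inference algorithm leverages can be shown in a few lines. The sketch assumes a single partition cell with constant intensity: if the event count in a cell of area A is Poisson(λ·A) and λ has a Gamma(shape, rate) prior, the posterior is Gamma(shape + count, rate + A). The spanning-tree partitions and the data thinning step from the abstract are not reproduced here, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GammaDist:
    """Gamma distribution in the shape/rate parameterization."""
    shape: float
    rate: float

    @property
    def mean(self) -> float:
        return self.shape / self.rate

def update_intensity(prior: GammaDist, count: int, area: float) -> GammaDist:
    """Conjugate update for a Poisson count with mean intensity * area."""
    return GammaDist(prior.shape + count, prior.rate + area)

# Usage: 30 events observed in a cell of area 9 under a Gamma(1, 1) prior.
posterior = update_intensity(GammaDist(1.0, 1.0), count=30, area=9.0)
```

Because the update is closed-form, cell-level intensities can be integrated out or sampled exactly, which is one reason conjugacy is attractive inside a larger sampling scheme.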
Citations: 0
Multi-population sufficient dimension reduction
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-24 DOI: 10.1016/j.csda.2025.108321
Xuerong Meggie Wen, Yuexiao Dong, Li-Xing Zhu
A novel dimension-reduction method is introduced for multi-population data. The approach conducts a joint analysis that exploits information shared across populations while accommodating population-specific effects. Unlike partial dimension reduction methods, which identify related directions across all populations, or conditional analyses conducted independently within each population, the proposed two-step procedure leverages cross-population information to enhance estimation accuracy. The methodology is demonstrated through simulations and two real-data applications.
Citations: 0
Seasonal ARIMA models with a random period
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-09 DOI: 10.1016/j.csda.2025.108320
Abdelhakim Aknouche, Stefanos Dimitrakopoulos, Nadia Rabehi
A general class of seasonal autoregressive integrated moving average models (SARIMA), whose period is an independent and identically distributed random process valued in a finite set, is proposed. This class of models is named random period seasonal ARIMA (SARIMAR). Attention is focused on three subsets of them: the random period seasonal autoregressive (SARR) models, the random period seasonal moving average (SMAR) models and the random period seasonal autoregressive moving average (SARMAR) models. First, the causality, invertibility, and autocovariance shape of these models are revealed. Then, the estimation of the model components (coefficients, innovation variance, probability distribution of the period, (unobserved) sample-path of the random period) is carried out using the Expectation-Maximization algorithm. In addition, a procedure for random elimination of seasonality is developed. A simulation study is conducted to assess the estimation accuracy of the proposed algorithmic scheme. Finally, the usefulness of the proposed methodology is illustrated with two applications about the annual Wolf sunspot numbers and the Canadian lynx data.
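A small simulation helps picture what a random period means. The sketch below assumes the simplest reading of the random period seasonal autoregressive (SARR) idea: X_t = φ·X_{t − S_t} + ε_t, where the period S_t is drawn independently at each time from a finite set. The exact model form, the Gaussian innovations, and the function name are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def simulate_sarr(n, phi, periods, probs, seed=0):
    """Simulate X_t = phi * X_{t - S_t} + eps_t with an i.i.d. period S_t (assumed form)."""
    rng = np.random.default_rng(seed)
    s = rng.choice(periods, size=n, p=probs)  # unobserved sample path of the random period
    eps = rng.standard_normal(n)              # Gaussian innovations (assumption)
    x = np.zeros(n)
    burn = max(periods)
    x[:burn] = eps[:burn]                     # initialize before the recursion can look back
    for t in range(burn, n):
        x[t] = phi * x[t - s[t]] + eps[t]
    return x, s, eps

# Usage: a period that is 11 or 12 with equal probability.
x, s, eps = simulate_sarr(500, phi=0.5, periods=[11, 12], probs=[0.5, 0.5])
```

In a setup like this the period path `s` is latent, which is why the abstract lists the sample path of the random period among the quantities estimated with the Expectation-Maximization algorithm.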
Citations: 0
Debiased quantile significance testing with machine learning methods
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-05 DOI: 10.1016/j.csda.2025.108319
Jiarong Ding, Yanmei Shi, Niwen Zhou, Mei Yao, Xu Guo
Testing the significance of a subset of covariates for a response is a critical problem with broad applications. A novel nonparametric significance testing procedure is developed to test whether a set of target covariates provides incremental information about the conditional quantile of the response given the other covariates. The proposed test statistics are constructed within the framework of debiased machine learning, which enables flexible estimation of unknown functions by leveraging machine learning methods. The asymptotic properties of the proposed test statistic under the null hypothesis are established, and the power under the alternatives is analyzed, demonstrating the ability of the procedure to detect local alternatives at the optimal parametric rate. To further enhance power, an ensemble quantile significance testing procedure is introduced. Extensive numerical studies and real data applications are conducted to illustrate the finite-sample performance of the proposed testing procedures.
Citations: 0
Certifiably optimal direction estimation in sparse single-index model
IF 1.6 Zone 3 (Mathematics) Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-01 DOI: 10.1016/j.csda.2025.108307
Yangzhou Chen , Lei Yan , Xin Chen , Shuaida He
In this paper, we propose a novel method for coefficient estimation in sparse single-index models (SIM). Our approach employs a customized branch-and-bound algorithm to efficiently solve the non-convex problem of sparse direction estimation, which arises from the discrete nature of variable selection. To address this non-convex optimization problem, we derive upper bounds using techniques such as spectral decomposition, matrix inequalities, and the Gershgorin circle theorem, while the lower bounds are obtained through methods like vector truncation and adaptations of the Rifle algorithm. Furthermore, we design customized branching and node selection strategies, with hyperparameters chosen based on AIC, BIC, and HBIC criteria. We prove the convergence of our algorithm, ensuring it reliably reaches optimal solutions. Extensive simulation studies and real data analysis further illustrate the reliable performance and applicability of our proposed method.
Citations: 0