Journal of Multivariate Analysis: Latest Articles

Estimation for partially time-varying spatial autoregressive panel data model under linear constraints
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-19 DOI: 10.1016/j.jmva.2025.105547
Lingling Tian, Chuanhua Wei, Bing Sun, Mixia Wu
This paper investigates a constrained spatial autoregressive panel data model with fixed effects, partially linear time-varying coefficients, and time-varying spatial dependence. We propose a constrained profile two-stage least squares estimator and establish its asymptotic properties. Furthermore, a statistical test is constructed to examine whether the constant coefficients satisfy pre-specified linear constraints. Monte Carlo simulations under both independent and α-mixing error structures demonstrate the finite-sample performance of the proposed estimators and testing procedure. A real data example is provided to illustrate the practical applicability of the method. In addition, when the time dimension T is relatively small, a Block Bootstrap procedure is proposed to compute the p-value for the test.
Volume 212, Article 105547
Citations: 0
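The abstract above mentions a Block Bootstrap procedure for computing the p-value when T is small. The paper's exact procedure is not given here, so the following is only a minimal sketch of a generic moving-block bootstrap, assuming a toy statistic (the absolute sample mean) and a toy null hypothesis (mean zero). The function names `moving_block_resample` and `bootstrap_p_value`, the block length, and the number of replicates are all illustrative assumptions.

```python
import random
from statistics import mean


def moving_block_resample(series, block_len, rng):
    """Draw one moving-block bootstrap replicate of `series`.

    Overlapping blocks of `block_len` consecutive observations are sampled
    with replacement, concatenated, and trimmed to the original length, so
    short-range serial dependence inside each block is preserved.
    """
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    n_starts = n - block_len + 1          # number of admissible block starts
    out = []
    while len(out) < n:
        start = rng.randrange(n_starts)   # uniformly chosen block start
        out.extend(series[start:start + block_len])
    return out[:n]


def bootstrap_p_value(series, block_len, n_boot=999, seed=0):
    """Two-sided bootstrap p-value for H0: mean == 0 (toy statistic only)."""
    rng = random.Random(seed)
    centre = mean(series)
    observed = abs(centre)
    # Centre the data so that the resampled series satisfy the null hypothesis.
    centred = [x - centre for x in series]
    exceed = 0
    for _ in range(n_boot):
        replicate = moving_block_resample(centred, block_len, rng)
        if abs(mean(replicate)) >= observed:
            exceed += 1
    # The +1 terms keep the estimated p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)
```

Resampling whole blocks rather than single observations is what lets the replicate keep some of the dependence found in α-mixing errors; the block length trades off bias against variance and would need to be chosen for the data at hand.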
A componentwise estimation procedure for multivariate location and scatter: Robustness, efficiency and scalability
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-19 DOI: 10.1016/j.jmva.2025.105546
Soumya Chakraborty, Ayanendranath Basu, Abhik Ghosh
Covariance matrix estimation is an important problem in multivariate data analysis, from both theoretical and applied points of view. Many simple and popular covariance matrix estimators are known to be severely affected by model misspecification and the presence of outliers in the data; on the other hand, robust estimators with reasonably high efficiency are often computationally challenging for modern large and complex datasets. In this work, we propose a new, simple, robust and highly efficient method for estimation of the location vector and the scatter matrix for elliptically symmetric distributions. The proposed estimation procedure is designed in the spirit of the minimum density power divergence (DPD) estimation approach, with appropriate modifications that make our proposal (componentwise minimum DPD estimation) computationally very economical and scalable to large as well as higher-dimensional datasets. Consistency and asymptotic normality of the proposed componentwise estimators of the multivariate location and scatter are established, along with asymptotic positive definiteness of the estimated scatter matrix. The robustness of our estimators is studied by means of influence functions. All theoretical results are illustrated further under multivariate normality. A large-scale simulation study is presented to assess the finite-sample performance and scalability of our method in comparison to the usual maximum likelihood estimator (MLE), the ordinary minimum DPD estimator (MDPDE) and other popular non-parametric methods. The applicability of our method is further illustrated with a real dataset on credit card transactions.
Volume 212, Article 105546
Citations: 0
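To make the minimum DPD idea in the abstract above concrete, here is a minimal sketch of the ordinary univariate MDPDE of a normal mean with known scale. It is not the paper's componentwise procedure. The function name `mdpde_location`, the tuning value `alpha=0.5`, the median start, and the toy data are all assumptions for illustration. For this model, setting the derivative of the DPD objective to zero gives a weighted-mean fixed point, which the code iterates.

```python
import math
from statistics import median


def mdpde_location(xs, sigma=1.0, alpha=0.5, tol=1e-10, max_iter=200):
    """Minimum-DPD estimate of the mean of N(mu, sigma^2) with sigma known.

    The stationarity condition of the DPD objective is the fixed point
        mu = sum(w_i * x_i) / sum(w_i),
        w_i = exp(-alpha * (x_i - mu)^2 / (2 * sigma^2)),
    so observations far from the current fit are down-weighted exponentially.
    """
    mu = median(xs)                      # robust starting value (an assumption)
    for _ in range(max_iter):
        weights = [math.exp(-alpha * (x - mu) ** 2 / (2 * sigma ** 2)) for x in xs]
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Toy data: five points symmetric about 0 plus one gross outlier.
data = [-1.0, -0.5, 0.0, 0.5, 1.0, 50.0]
sample_mean = sum(data) / len(data)      # pulled far away by the outlier
robust_mean = mdpde_location(data)       # stays near the bulk of the data
```

As `alpha` approaches 0 all weights approach 1 and the estimate approaches the sample mean, which is the MLE in this model; larger `alpha` buys robustness at some cost in efficiency.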
Symmetric Bernoulli distributions and minimal dependence copulas
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-17 DOI: 10.1016/j.jmva.2025.105545
Alessandro Mutti, Patrizia Semeraro
The key result of this paper is to characterize all multivariate symmetric Bernoulli distributions whose sum is minimal under the convex order. In doing so, we automatically characterize extremal negative dependence among Bernoulli variables, since multivariate distributions with minimal convex sums are known to be strongly negatively dependent. Moreover, beyond its interest per se, this result provides insight into negative dependence within the class of copulas. In particular, two classes of copulas can be built from multivariate symmetric Bernoulli distributions: extremal mixture copulas and FGM copulas. We analyze the extremal negative dependence structures of copulas constructed from symmetric Bernoulli vectors with minimal convex sums and explicitly find a class of minimal dependence copulas. This analysis is completed by investigating minimal pairwise dependence measures and correlations. Our main results derive from the geometric and algebraic representations of multivariate symmetric Bernoulli distributions, which effectively encode key statistical properties.
Volume 212, Article 105545
Citations: 0
Stochastic arrangement increasing property of skew-normal distributions
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-17 DOI: 10.1016/j.jmva.2025.105544
Jiajie Lu, Xiaohu Li
In this study, we investigate both sufficient and necessary conditions for bivariate skew-normal distributions to be stochastic arrangement increasing. The main results serve as either a natural extension of or a useful supplement to the characterization of this property for bivariate normal distributions due to Cai and Wei (2015). We also generalize these results to multivariate skew-normal distributions. Numerical examples based on the theory and on a real dataset are presented to illustrate the main results.
Volume 212, Article 105544
Citations: 0
Recovering Imbalanced Clusters via gradient-based projection pursuit
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-15 DOI: 10.1016/j.jmva.2025.105530
Martin Eppert, Satyaki Mukherjee, Debarghya Ghoshdastidar
Projection Pursuit is a classic exploratory technique for finding interesting projections of a dataset. We propose a method for recovering projections containing either Imbalanced Clusters or a Bernoulli–Rademacher distribution using a gradient-based technique to optimize the projection index. As sample complexity is a major limiting factor in Projection Pursuit, we analyze our algorithm’s sample complexity within a Planted Vector setting where we can observe that Imbalanced Clusters can be recovered more easily than balanced ones. Additionally, we give a generalized result that works for a variety of data distributions and projection indices. We compare these results to computational lower bounds in the Low-Degree-Polynomial Framework. Finally, we experimentally evaluate our method’s applicability to real-world data using FashionMNIST and the Human Activity Recognition Dataset, where our algorithm outperforms others when only a few samples are available.
Volume 212, Article 105530
Citations: 0
Recent advances in principal component analysis for directional data
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-15 DOI: 10.1016/j.jmva.2025.105528
Anahita Nodehi, Meisam Moghimbeygi, Christophe Ley
The high dimensionality of the input data can pose multiple problems when implementing statistical techniques. The presence of many dimensions in the data can lead to challenges in visualizing the data, higher computational demands, and a higher probability of over-fitting or under-fitting in modeling. Furthermore, the curse of dimensionality compounds these issues: the number of observations needed for accurate modeling increases exponentially with the number of dimensions. Dimension reduction tools help overcome this challenge. Principal Component Analysis (PCA) is the most widely used technique and has been studied intensively in classical linear spaces. However, in applied sciences such as biology, bioinformatics, astronomy and geology, there are many instances in which the data's support is a non-Euclidean space. In fact, the available data often include elements of Riemannian manifolds such as the unit circle, torus, sphere, and their extensions. The terms "manifold-valued" or "directional" data are used in the literature for these situations. When dealing with directional data, the linear nature of PCA might pose a challenge to achieving accurate data reduction. This paper therefore reviews and investigates the methodological aspects of PCA on directional data and their practical applications.
Volume 212, Article 105528
Citations: 0
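The abstract above notes that linear tools can struggle with data on the circle or sphere. A small sketch of the simplest case, the mean of two angles, may help; it is a standard illustration, not taken from the paper, and the function name `circular_mean_deg` is an assumption. Averaging the raw angles treats them as points on a line, while the circular mean averages the corresponding unit vectors.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction of angles given in degrees, computed on the unit circle."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    # atan2 recovers the direction of the summed unit vectors.
    return math.degrees(math.atan2(s, c)) % 360.0


angles = [350.0, 10.0]                       # two directions close to 0 degrees
linear_mean = sum(angles) / len(angles)      # 180.0, pointing the opposite way
circular = circular_mean_deg(angles)         # close to 0 (equivalently 360)
```

The same mismatch between linear averaging and the geometry of the data is what motivates PCA variants built for manifold-valued observations.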
Nonlinear functional principal component analysis using neural networks
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-15 DOI: 10.1016/j.jmva.2025.105526
Rou Zhong, Jingxiao Zhang, Chunming Zhang
Functional principal component analysis (FPCA) is an important technique for dimension reduction in functional data analysis (FDA). The classical FPCA method is based on the Karhunen-Loève expansion, which assumes a linear structure of the observed functional data. However, this assumption may not always be satisfied, and the FPCA method can become inefficient when the data deviate from it. In this paper, we propose a novel FPCA method that uses neural networks and is suitable for data with a nonlinear structure. We construct networks that can be applied to functional data and explore the corresponding universal approximation property. The main use of our proposed nonlinear FPCA method is curve reconstruction. We conduct a simulation study to evaluate the performance of our method. The proposed method is also applied to a real-world data set to further demonstrate its superiority.
Volume 212, Article 105526
Citations: 0
Robust variable selection criteria for the penalized regression
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-14 DOI: 10.1016/j.jmva.2025.105540
Abhijit Mandal, Samiran Ghosh
We develop a robust variable selection framework that integrates divergence-based M-estimation with penalization. The proposed method yields regression parameter estimates that are resistant to outliers while simultaneously identifying the most relevant explanatory variables. The asymptotic distribution and influence function of the estimators are derived. Classical model selection criteria such as Mallows’ Cp and the Akaike information criterion (AIC) are known to deteriorate under heavy-tailed errors or contamination. To address this issue, we introduce robust counterparts of these criteria, constructed from our divergence-based estimators. The proposed approach substantially improves variable selection and prediction performance in the presence of outliers, while maintaining competitiveness with state-of-the-art robust high-dimensional methods. The practical utility of the procedure is further demonstrated through an analysis of the plasma Beta-Carotene dataset.
Volume 211, Article 105540
Citations: 0
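For reference, the classical criteria named in the abstract above can be written in a few lines. This sketch shows only the textbook, non-robust versions for a Gaussian linear model, not the robust counterparts the paper proposes; the function names and the numbers in the usage lines are illustrative assumptions. Both criteria depend on the residual sum of squares, so a single gross outlier can inflate them, which is consistent with the sensitivity the abstract describes.

```python
import math


def mallows_cp(rss_sub, n, k, sigma2_full):
    """Classical Mallows' Cp for a submodel with k coefficients.

    rss_sub     : residual sum of squares of the submodel
    n           : number of observations
    k           : number of fitted coefficients (intercept included)
    sigma2_full : error-variance estimate from the full model
    """
    return rss_sub / sigma2_full - n + 2 * k


def gaussian_aic(rss, n, k):
    """AIC of a Gaussian linear model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


# Hypothetical numbers: full model with 5 coefficients fitted to 50 points.
n_obs, k_full, rss_full = 50, 5, 90.0
sigma2 = rss_full / (n_obs - k_full)                    # unbiased variance estimate
cp_full = mallows_cp(rss_full, n_obs, k_full, sigma2)   # equals k_full by construction
```

A submodel whose Cp is close to its own number of coefficients is usually read as having little bias relative to the full model.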
Density and graph estimation with smoothing splines and conditional Gaussian graphical models
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-14 DOI: 10.1016/j.jmva.2025.105543
Runfei Luo, Anna Liu, Hao Dong, Yuedong Wang
Density estimation and graphical models play important roles in statistical learning. The estimated density can be used to construct a graphical model that reveals conditional relationships, whereas a graphical structure can be used to build models for density estimation. We propose a semiparametric framework that models part of the density function nonparametrically using a smoothing spline ANOVA (SS ANOVA) model and the conditional density parametrically using a conditional Gaussian graphical model (cGGM). This flexible framework allows us to deal with high-dimensional data without the Gaussian assumption. We develop computationally efficient algorithms for estimation and provide theoretical guarantees for our procedure. Our experimental results show that the proposed framework outperforms both parametric and nonparametric baselines.
Volume 212, Article 105543
Citations: 0
On the use of the Gram matrix for multivariate functional principal components analysis
IF 1.4 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2025-11-13 DOI: 10.1016/j.jmva.2025.105525
Steven Golovkine, Edward Gunning, Andrew J. Simpkin, Norma Bargary
Dimension reduction is crucial in functional data analysis (FDA). The key tool to reduce the dimension of the data is functional principal component analysis. Existing approaches for functional principal component analysis usually involve the diagonalization of the covariance operator. With the increasing size and complexity of functional datasets, estimating the covariance operator has become more challenging. Therefore, there is a growing need for efficient methodologies to estimate the eigencomponents. Using the duality of the space of observations and the space of functional features, we propose to use the inner-product between the curves to estimate the eigenelements of multivariate and multidimensional functional datasets. The relationship between the eigenelements of the covariance operator and those of the inner-product matrix is established. We explore the application of these methodologies in several FDA settings and provide general guidance on their usability.
Citations: 0