Journal of Multivariate Analysis: Latest Articles

A method for sparse and robust independent component analysis
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105587 | Pub Date: 2026-05-01 | Epub Date: 2025-12-10 | DOI: 10.1016/j.jmva.2025.105587
Lauri Heinonen, Joni Virta
This work presents sparse invariant coordinate selection, SICS, a new method for sparse and robust independent component analysis. SICS is based on classical invariant coordinate selection, which is presented in such a form that a LASSO-type penalty can be applied to promote sparsity. Robustness is achieved by using robust scatter matrices. In the first part of the paper, the background and building blocks: scatter matrices, measures of robustness, ICS and independent component analysis, are carefully introduced. Then the proposed new method and its algorithm are derived and presented. This part also includes consistency and breakdown point results for a general case of sparse ICS-like methods. The performance of SICS in identifying sparse independent component loadings is investigated with multiple simulations. The method is illustrated with an example in constructing sparse causal graphs and we also propose a graphical tool for selecting the appropriate sparsity level in SICS.
Citations: 0
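The abstract above says SICS promotes sparsity through a LASSO-type penalty. As a rough illustration of why an L1 penalty yields exact zeros (this is not the SICS algorithm, which the abstract does not specify), the sketch below shows the soft-thresholding operator, the closed-form solution of the one-dimensional problem min_b 0.5*(z - b)^2 + lam*|b|. The function name, the example loading vector, and the value of lam are illustrative assumptions.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Minimizer of 0.5 * (z - b)**2 + lam * abs(b) over b, for lam >= 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # coefficients with |z| <= lam are set exactly to zero


# Hypothetical dense loading vector (not from the paper).
loadings = [0.9, -0.05, 0.02, -0.7]
sparse_loadings = [soft_threshold(z, lam=0.1) for z in loadings]
```

Larger values of lam zero out more entries, which is the sense in which the penalty level controls sparsity.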
Subgroup effect quantile regression with high dimensional missing panel data
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105593 | Pub Date: 2026-05-01 | Epub Date: 2025-12-23 | DOI: 10.1016/j.jmva.2025.105593
Shu-Yu Li, Han-Ying Liang
Based on panel data, we explore partially linear varying-coefficient quantile regression with group effects under high dimension and missing observations. Using generalized estimating equations, we construct oracle estimators along with smoothed version for the unknown parameter vector, varying-coefficient functions as well as group effects, and establish their asymptotic normality. In the estimation procedure, the within-subject correlations of the panel data are considered by introducing working correlation matrix. We further investigate variable selection by the SCAD penalty for the parameters, varying-coefficient functions and group identification simultaneously, and discuss oracle properties. Meanwhile, hypothesis tests for the parameter, varying-coefficient functions and group effects are done, asymptotic distributions of the restricted estimators and test statistics under both the null and local alternative hypotheses are analyzed. Also, simulation study and real data analysis are conducted to evaluate the performance of the proposed methods.
Citations: 0
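The abstract above performs variable selection with the SCAD penalty. For readers unfamiliar with it, here is a minimal sketch of the standard SCAD penalty function as defined by Fan and Li (2001), with their commonly used default a = 3.7. How the paper combines this penalty with the quantile loss and group identification is not described in the abstract, so only the penalty itself is shown; the function name is an assumption.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty value at theta; assumes lam > 0 and a > 2."""
    t = abs(theta)
    if t <= lam:
        # LASSO-like linear region near zero
        return lam * t
    if t <= a * lam:
        # quadratic transition region
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # constant region: large coefficients receive no additional penalty
    return lam * lam * (a + 1) / 2
```

The penalty is continuous at both breakpoints and flat beyond a*lam, so large coefficients are not shrunk further while small ones are treated as under the LASSO.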
Model-free feature screening for ultrahigh dimensional data with responses missing not at random
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105605 | Pub Date: 2026-05-01 | Epub Date: 2026-01-15 | DOI: 10.1016/j.jmva.2026.105605
Yuliang Bai, Niansheng Tang
Feature screening method is an important tool for screening active features in ultrahigh dimensional data analysis. Existing feature screening methods mainly focus on the fully observed data or missing responses at random. But in many applied fields such as biomedicine, social science and epidemiological studies, responses might be subject to nonignorable missingness due to various reasons such as dropout. To this end, this paper proposes a new adjusted Spearman rank correlation to screen active features by incorporating the Spearman rank correlation and its conditional expectation in the presence of nonignorable missing responses. To circumvent the notorious identification problem, we introduce instrumental variables into the propensity score (PS) function, which is specified by a more general semiparametric regression model. A nonparametric imputation method is developed to estimate the adjusted Spearman rank correlation. The proposed method has several desirable merits. First, it is model-free. Second, it is robust to outliers, heavy tailed data and the misspecification of the PS function. Third, under some weaker regularity conditions than existing missing data literature, it has sure screening property and ranking consistency, and can well control the false discovery rate regardless of known or consistently estimated parameters in the PS function. Simulation studies and two real examples are used to investigate the performance of the proposed methodologies.
Citations: 0
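The screening statistic in the abstract above builds on the Spearman rank correlation. The sketch below computes the classical Spearman correlation on fully observed pairs only: the Pearson correlation of average ranks. It does not reproduce the paper's adjustment for nonignorable missing responses, which the abstract does not spell out. Function names and the toy data are assumptions.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data with one extreme response: the ranks, and hence the
# correlation, are unaffected by the size of the outlier.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [2.0, 4.0, 6.0, 8.0, 1000.0]
rho = spearman(feature, response)
```

Because only ranks enter the statistic, it is insensitive to outliers and heavy tails, which is consistent with the robustness claimed in the abstract.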
Nonparametric estimation from correlated copies of a drifted process
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105607 | Pub Date: 2026-05-01 | Epub Date: 2026-01-17 | DOI: 10.1016/j.jmva.2026.105607
Nicolas Marie
This paper presents several situations leading to the observation of multiple correlated copies of a drifted process, and then non-asymptotic risk bounds are established on nonparametric estimators of the drift function b_0 and its derivative. For drifted Gaussian processes with a regular enough covariance function, a sharper risk bound is established on the estimator of b_0′, and a model selection procedure is provided with theoretical guarantees.
Citations: 0
Multiplier and empirical subsample bootstraps for maxima in high dimensional time series analysis
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105579 | Pub Date: 2026-05-01 | Epub Date: 2025-12-16 | DOI: 10.1016/j.jmva.2025.105579
Ruru Ma, Shibin Zhang
The distribution of maxima is crucial for simultaneous inference of high dimensional parameters. This paper focuses on the bootstrap approximation to the distribution of maxima in high dimensional time series analysis. We propose two novel approaches, the multiplier subsample bootstrap (MSB) and the empirical subsample bootstrap (ESB), to approximate the distribution of maxima constructed from high dimensional time series. Both approaches utilize block-based subsample statistics to build the bootstrap statistics. The MSB assigns weights to block-based subsample statistics using random elements, while the ESB resamples from them independently and uniformly. Under certain regularity conditions, we establish the asymptotic validity of the two proposed approaches, when the parameter dimension is large or even much larger than the sample size. A simulation study demonstrates that both the MSB and ESB perform well.
Citations: 0
Envelope-based partial least squares in functional regression
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105592 | Pub Date: 2026-05-01 | Epub Date: 2025-12-23 | DOI: 10.1016/j.jmva.2025.105592
Minxuan Wu, Joseph Antonelli, Zhihua Su
In this article, we extend predictor envelope models to settings with multivariate outcomes and multiple, functional predictors. We propose a two-step estimation strategy, which first projects the function onto a finite-dimensional Euclidean space before fitting the model using existing approaches to envelope models. We first develop an estimator under a linear model with continuous outcomes and then extend this procedure to the more general class of generalized linear models, which allow for a variety of outcome types. We provide asymptotic theory for these estimators showing that they are root-n consistent and asymptotically normal when the regression coefficient is finite-rank. Additionally we show that consistency can be obtained even when the regression coefficient has rank that grows with the sample size. Extensive simulation studies confirm our theoretical results and show strong prediction performance of the proposed estimators. Additionally, we provide multiple data analyses showing that the proposed approach performs well in real-world settings under a variety of outcome types compared with existing dimension reduction approaches.
Citations: 0
Simultaneous heterogeneity and reduced-rank learning for multivariate response regression
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105578 | Pub Date: 2026-05-01 | Epub Date: 2025-12-05 | DOI: 10.1016/j.jmva.2025.105578
Jie Wu, Bo Zhang, Daoji Li, Zemin Zheng
Heterogeneous data are now ubiquitous in many applications in which correctly identifying the subgroups from a heterogeneous population is critical. Although there is an increasing body of literature on subgroup detection, existing methods mainly focus on the univariate response setting. In this paper, we propose a joint heterogeneity and reduced-rank learning framework to simultaneously identify the subgroup structure and estimate the covariate effects for heterogeneous multivariate response regression. In particular, our approach uses rank-constrained pairwise fusion penalization and conducts the subgroup analysis without requiring prior knowledge regarding the individual subgroup memberships. We implement the proposed approach by an alternating direction method of multipliers (ADMM) algorithm and show its convergence. We also establish the asymptotic properties for the resulting estimators under mild and interpretable conditions. A predictive information criterion is proposed to select the rank of the coefficient matrix with theoretical support. The effectiveness of the proposed approach is demonstrated through simulation studies and a real data application.
Citations: 0
Robust factor analysis with exponential squared loss
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105567 | Pub Date: 2026-05-01 | Epub Date: 2025-12-01 | DOI: 10.1016/j.jmva.2025.105567
Jiaqi Hu, Tingyin Wang, Xueqin Wang
The large dimensional factor model, aimed at reducing dimensionality and extracting features through a few latent common factors, has sparked significant interest due to its broad applications. Despite the popularity of traditional methods for factor models, they may yield incorrect estimators for heavy-tailed data. To address this issue, we introduce the exponential squared loss to the factor model in this study, denoted as the Robust Exponential Factor Analysis (REFA). We propose a modified rank minimization technique to enhance the estimation accuracy of factor numbers in finite-sample cases. Consistency properties for factors and loadings are established under mild conditions, without any moment assumptions on the errors. The performance of REFA with finite samples under both light and heavy-tailed cases has been demonstrated through simulation studies. Furthermore, an analysis employing a financial dataset of asset returns underscores the superiority of REFA. To facilitate the implementation of our proposed methodology by researchers, we have developed an R package named REFA, which is available on CRAN.
Citations: 0
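To show why an exponential squared loss helps with heavy-tailed data, as the abstract above claims, the sketch below compares it with the ordinary squared loss. It uses a form common in the robust-regression literature, 1 - exp(-r^2 / gamma); the exact parametrization and the choice of tuning parameter in REFA are not given in the abstract, so gamma = 1.0 and the function names are assumptions.

```python
import math


def exp_squared_loss(residual: float, gamma: float = 1.0) -> float:
    """Bounded loss 1 - exp(-r^2 / gamma); assumes gamma > 0."""
    return 1.0 - math.exp(-(residual ** 2) / gamma)


def squared_loss(residual: float) -> float:
    """Ordinary least-squares loss, unbounded in the residual."""
    return residual ** 2


# For a small residual the two losses nearly agree (up to the 1/gamma
# scale); for a gross outlier the exponential squared loss stays below 1.
small, outlier = 0.01, 50.0
small_pair = (exp_squared_loss(small), squared_loss(small))
outlier_pair = (exp_squared_loss(outlier), squared_loss(outlier))
```

Because the loss is bounded by 1, a single extreme observation can contribute only a limited amount to the objective, whereas under squared loss its contribution grows without bound.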
Low dimensional factor model-based tests for assessing vector correlation in high-dimensional settings
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105588 | Pub Date: 2026-05-01 | Epub Date: 2025-12-17 | DOI: 10.1016/j.jmva.2025.105588
Masashi Hyodo, Takahiro Nishiyama, Shoichi Narita
This study proposes a new test for vector correlation in a high-dimensional framework, while accommodating a low-dimensional latent factor model. Our test, built under low-dimensional factor models, distinguishes from previous normal approximation-based tests, which are valid under a weak-spike structure. We propose a modified RV coefficient for high-dimensional data, and show that its null-limiting distributions follow a weighted mixture of chi-square distributions under a high-dimensional asymptotic regime integrated with weak technical conditions. By applying this asymptotic result and estimation theory of the number of factors in a low-dimensional factor model, we propose a new approximation test for vector correlations. We also derive the asymptotic power function for the proposed test. Lastly, we examine the finite sample and dimensional performance of this test using Monte Carlo simulations.
Citations: 0
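The test in the abstract above is built on a modified RV coefficient. As background, the sketch below computes the classical sample RV coefficient, RV = tr(S_xy S_yx) / sqrt(tr(S_xx^2) tr(S_yy^2)), using the fact that each trace equals a squared Frobenius norm. The paper's high-dimensional modification is not described in the abstract and is not reproduced here; function names and data layout (one observation per row) are assumptions.

```python
import math


def centered(rows):
    """Subtract the column means from each row."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def cross_cov(a, b):
    """Sample cross-covariance matrix of two centered data sets."""
    n = len(a)
    return [
        [sum(a[i][j] * b[i][k] for i in range(n)) / (n - 1)
         for k in range(len(b[0]))]
        for j in range(len(a[0]))
    ]


def frob_sq(m):
    """Squared Frobenius norm, equal to tr(M M^T)."""
    return sum(v * v for row in m for v in row)


def rv_coefficient(x_rows, y_rows):
    """Classical RV coefficient between two sets of variables."""
    xc, yc = centered(x_rows), centered(y_rows)
    sxy = cross_cov(xc, yc)
    sxx = cross_cov(xc, xc)
    syy = cross_cov(yc, yc)
    return frob_sq(sxy) / math.sqrt(frob_sq(sxx) * frob_sq(syy))
```

The coefficient lies in [0, 1]; it is 0 when every variable in one set is uncorrelated with every variable in the other, and 1 when one set is a rescaled copy of the other.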
A sparse dimension-reduced subspace-based approach for detecting multiple change points in high-dimensional data
IF 1.4 | Zone 3, Mathematics | Q2 Statistics & Probability | Vol. 213, Article 105594 | Pub Date: 2026-05-01 | Epub Date: 2025-12-27 | DOI: 10.1016/j.jmva.2025.105594
Luoyao Yu, Rongzhu Zhao, Jiaqi Huang, Lixing Zhu, Xuehu Zhu
This paper develops a novel penalized matrix estimation method for sparse dimension reduction when detecting change points in high-dimensional data. The strategy is to project high-dimensional data onto a low-dimensional subspace without losing any change point information, enabling efficient change point detection within this dimension-reduced subspace. Theoretical analysis establishes the consistency of the proposed matrix estimation and selects consistently the important variables which have change points. Numerical studies on synthetic and several real data sets suggest that the dimension reduction strategy enhances the performance of existing approaches. Additionally, the results showcase the efficiency of the proposed algorithm for selecting important variables in high-dimensional sparse data.
Citations: 0