
Journal of Multivariate Analysis: Latest Articles

Global tests for detecting change in mean vector functions of multivariate functional data with repeated observations
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2026-01-27 | DOI: 10.1016/j.jmva.2026.105615
Zhiping Qiu , Wei Lin , Xiaming Tu , Jin-Ting Zhang
In many scientific and technological fields, multivariate functional data are often repeatedly observed under varying conditions over time. A fundamental question is whether the mean vector function remains the same throughout the entire period. This paper introduces two novel global test statistics that leverage an integration technique to address this question. The asymptotic distributions of the proposed test statistics under the null hypothesis are derived, and their root-n consistency is established. Simulation studies are conducted to evaluate the numerical performance of the proposed tests, which are further illustrated through an analysis of publicly available EEG motion data.
Journal of Multivariate Analysis, Volume 214, Article 105615
Citations: 0
A scalable model averaging based on Kullback–Leibler distance for multivariate regression models
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2026-01-23 | DOI: 10.1016/j.jmva.2026.105614
Jie Zeng , Guozhi Hu , Weihu Cheng
This paper considers the estimation problem in multivariate regression models. Under this framework, we develop a novel two-stage model averaging procedure. In the first stage, we construct a scalable model averaging estimator, which involves transforming the original model based on the singular value decomposition. When the dimension of the regressor vector is K, this approach enables us to average the estimators from a candidate model set of size K instead of size 2^K. The second stage is to find the optimal weights for averaging by applying a weight choice criterion based on the Kullback–Leibler distance. We prove that the minimum weighted squared loss from the scalable model averaging is asymptotically the same as that from the original model averaging, further demonstrate the asymptotic optimality of the scalable model averaging estimator using Kullback–Leibler-distance-based weights, and derive the rate at which the resulting weights tend to the risk-based optimal weights. In comparison with existing model averaging methods, the simulation results show that, in terms of weighted mean squared prediction error and computation time, our proposal is more efficient, especially when the number of candidate models is large and the sample size is small. Moreover, a real data analysis is provided to illustrate the application of our method in practice.
Journal of Multivariate Analysis, Volume 214, Article 105614
Citations: 0
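The reduction from 2^K to K candidate models mentioned in the abstract above can be made concrete with a small count. The sketch below is only illustrative: `all_subset_models` enumerates every subset of K regressors, while `nested_models` assumes, hypothetically, that the K candidates form a nested family over transformed regressors. The abstract does not spell out the authors' exact construction, so both function names and the nested structure are assumptions.

```python
from itertools import combinations


def all_subset_models(K):
    """Every subset of K regressors (including the empty one): 2**K candidates."""
    return [c for size in range(K + 1) for c in combinations(range(K), size)]


def nested_models(K):
    """Assumed nested family: candidate k uses the first k regressors, so K candidates."""
    return [tuple(range(k)) for k in range(1, K + 1)]


if __name__ == "__main__":
    for K in (4, 10, 16):
        # The all-subsets count doubles with each extra regressor; the nested count grows by one.
        print(K, len(all_subset_models(K)), len(nested_models(K)))
```

Even at K = 16 the all-subsets family already has 65536 members, which is why averaging over a size-K set is attractive for computation time.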
On consistent estimation of dimension values
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2026-01-23 | DOI: 10.1016/j.jmva.2025.105591
Alejandro Cholaquidis , Antonio Cuevas , Beatriz Pateiro-López
The problem of estimating, from a random sample of points, the dimension of a compact subset S of the Euclidean space is considered. The emphasis is put on consistency results in the statistical sense, that is, statements of convergence to the true dimension value as the sample size grows to infinity. Among the many available definitions of dimension, we have focused (on the grounds of its statistical tractability) on three notions: the Minkowski dimension, the correlation dimension and the, perhaps less popular, concept of pointwise dimension. We prove the statistical consistency of some natural estimators of these quantities. Our proofs partially rely on the use of an instrumental estimator formulated in terms of the empirical volume function V_n(r), defined as the Lebesgue measure of the set of points whose distance to the sample is at most r. In particular, we explore the case in which the true volume function V(r) of the target set S is a polynomial on some interval starting at zero. An empirical study is also included. Our study aims to provide some theoretical support, and some practical insights, for the problem of deciding whether or not the set S has a dimension smaller than that of the ambient space. This is a major statistical motivation of the dimension studies, in connection with the so-called “Manifold Hypothesis”.
Journal of Multivariate Analysis, Volume 214, Article 105591
Citations: 0
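The empirical volume function V_n(r) from the abstract above can be approximated numerically. The following is a minimal sketch, not the authors' estimator: it draws a hypothetical sample on the unit circle in the plane (where V(r) = 4πr for r < 1, a polynomial near zero), estimates V_n(r) by Monte Carlo, and reads off a rough dimension from the log-log slope, using the fact that V(r) grows roughly like r^(d - dim) for small r, where d is the ambient dimension. The name `empirical_volume`, the two radii, and the sample sizes are assumptions chosen for illustration.

```python
import math
import random


def empirical_volume(sample, r, n_mc=4000, rng=None):
    """Monte Carlo estimate of V_n(r): the area of all points within distance r of a planar sample."""
    rng = rng or random.Random(0)
    xs = [p[0] for p in sample]
    ys = [p[1] for p in sample]
    # The r-neighbourhood of the sample is contained in this bounding box.
    lo_x, hi_x = min(xs) - r, max(xs) + r
    lo_y, hi_y = min(ys) - r, max(ys) + r
    box_area = (hi_x - lo_x) * (hi_y - lo_y)
    r2 = r * r
    hits = 0
    for _ in range(n_mc):
        u = rng.uniform(lo_x, hi_x)
        v = rng.uniform(lo_y, hi_y)
        if any((u - x) ** 2 + (v - y) ** 2 <= r2 for x, y in sample):
            hits += 1
    return box_area * hits / n_mc


# Hypothetical sample: 400 random points on the unit circle (a 1-dimensional set in the plane).
_rng = random.Random(1)
sample = []
for _ in range(400):
    angle = _rng.uniform(0.0, 2.0 * math.pi)
    sample.append((math.cos(angle), math.sin(angle)))

r1, r2 = 0.1, 0.2
v1 = empirical_volume(sample, r1, rng=random.Random(2))
v2 = empirical_volume(sample, r2, rng=random.Random(3))

# Slope of log V_n(r) against log r approximates (ambient dimension - dimension of S).
slope = math.log(v2 / v1) / math.log(r2 / r1)
dim_est = 2 - slope

if __name__ == "__main__":
    print("V_n(0.2) estimate:", v2, "reference 4*pi*r:", 4 * math.pi * r2)
    print("rough dimension estimate:", dim_est)
```

A dimension estimate clearly below 2 is the kind of evidence the abstract has in mind when asking whether S has a smaller dimension than the ambient space; the two-radius slope used here is a crude stand-in for the consistent estimators studied in the paper.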
Nonparametric estimation from correlated copies of a drifted process
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2026-01-17 | DOI: 10.1016/j.jmva.2026.105607
Nicolas Marie
This paper presents several situations leading to the observation of multiple correlated copies of a drifted process, and then non-asymptotic risk bounds are established on nonparametric estimators of the drift function b_0 and its derivative. For drifted Gaussian processes with a regular enough covariance function, a sharper risk bound is established on the estimator of b_0′, and a model selection procedure is provided with theoretical guarantees.
Journal of Multivariate Analysis, Volume 213, Article 105607
Citations: 0
Uniform knockoff filter for high-dimensional controlled graph recovery
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2026-01-16 | DOI: 10.1016/j.jmva.2026.105606
Jia Zhou , Yang Li , Zemin Zheng , Changchun Tan
Reproducible learning of high-dimensional graphical structures is fundamentally important in numerous contemporary applications, as it visually reveals the underlying conditional dependencies among complex network data. In this paper, we introduce a novel procedure called the uniform graphical knockoff filter, which controls the overall false discovery rate (FDR) in Gaussian graph recovery by utilizing knockoff variables and a uniform threshold. Compared to existing methods, it is more robust to varying levels of sparsity in the true graph. We provide theoretical justifications for the procedure, demonstrating that the FDR can be asymptotically controlled and that the power is asymptotically one under mild conditions. Extensive numerical studies confirm the robust and competitive finite-sample performance of the proposed method.
Journal of Multivariate Analysis, Volume 214, Article 105606
Citations: 0
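To give a feel for how a knockoff-based threshold controls the FDR, the sketch below implements the generic knockoff+ selection rule from the wider knockoff literature: choose the smallest t such that (1 + #{j : W_j ≤ -t}) / max(1, #{j : W_j ≥ t}) ≤ q. This is not the uniform graphical procedure of the paper above, whose details the abstract does not give; the statistics `W`, the target level `q`, and the function name are hypothetical illustration values.

```python
def knockoff_plus_threshold(W, q):
    """Smallest t among the nonzero |W_j| whose estimated false discovery proportion is at most q."""
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        negatives = sum(1 for w in W if w <= -t)  # proxy count of false discoveries
        positives = sum(1 for w in W if w >= t)   # discoveries made at threshold t
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")  # no threshold meets the target level: select nothing


# Hypothetical feature statistics: large positive values suggest true signals.
W = [5.0, 4.2, 3.9, 3.1, 2.8, 2.5, -0.4, 0.3, -0.2, 2.0]
T = knockoff_plus_threshold(W, q=0.2)
selected = [j for j, w in enumerate(W) if w >= T]

if __name__ == "__main__":
    print("threshold:", T)
    print("selected indices:", selected)
```

In the graph setting one such statistic would be attached to each candidate edge; using a single threshold for all of them is the sense in which a threshold can be called uniform, though how the paper constructs its statistics is not shown here.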
Model-free feature screening for ultrahigh dimensional data with responses missing not at random
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2026-01-15 | DOI: 10.1016/j.jmva.2026.105605
Yuliang Bai, Niansheng Tang
Feature screening is an important tool for identifying active features in ultrahigh dimensional data analysis. Existing feature screening methods mainly focus on fully observed data or responses missing at random. But in many applied fields such as biomedicine, social science and epidemiological studies, responses might be subject to nonignorable missingness due to various reasons such as dropout. To this end, this paper proposes a new adjusted Spearman rank correlation to screen active features by incorporating the Spearman rank correlation and its conditional expectation in the presence of nonignorable missing responses. To circumvent the notorious identification problem, we introduce instrumental variables into the propensity score (PS) function, which is specified by a more general semiparametric regression model. A nonparametric imputation method is developed to estimate the adjusted Spearman rank correlation. The proposed method has several desirable merits. First, it is model-free. Second, it is robust to outliers, heavy-tailed data and misspecification of the PS function. Third, under regularity conditions weaker than those in the existing missing data literature, it has the sure screening property and ranking consistency, and can control the false discovery rate well whether the parameters in the PS function are known or consistently estimated. Simulation studies and two real examples are used to investigate the performance of the proposed methodologies.
Journal of Multivariate Analysis, Volume 213, Article 105605
Citations: 0
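The building block of the screening method in the abstract above is the Spearman rank correlation. The sketch below shows plain marginal screening on fully observed data: rank each feature and the response, correlate the ranks, and keep the top `d` features. It deliberately omits the paper's adjustment for nonignorable missing responses (the instrumental variables, propensity score model, and imputation step), so it is a simplified baseline under that assumption; the helper names and toy data are hypothetical.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = average
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


def screen(features, y, d):
    """Indices of the d features with the largest absolute rank correlation with y."""
    scores = [abs(spearman(col, y)) for col in features]
    return sorted(range(len(features)), key=lambda j: -scores[j])[:d]


# Toy data: features 0 and 2 are monotone in y, feature 1 is only loosely related.
y = [1, 2, 3, 4, 5, 6]
features = [[1, 4, 9, 16, 25, 36], [3, 1, 4, 1, 5, 9], [6, 5, 4, 3, 2, 1]]
kept = screen(features, y, d=2)

if __name__ == "__main__":
    print("kept feature indices:", kept)
```

Because only ranks enter the score, a monotone but nonlinear feature such as the squares in feature 0 is scored as strongly as a linear one, which illustrates the model-free and outlier-robust character the abstract emphasizes.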
A sparse dimension-reduced subspace-based approach for detecting multiple change points in high-dimensional data
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2025-12-27 | DOI: 10.1016/j.jmva.2025.105594
Luoyao Yu , Rongzhu Zhao , Jiaqi Huang , Lixing Zhu , Xuehu Zhu
This paper develops a novel penalized matrix estimation method for sparse dimension reduction when detecting change points in high-dimensional data. The strategy is to project high-dimensional data onto a low-dimensional subspace without losing any change point information, enabling efficient change point detection within this dimension-reduced subspace. Theoretical analysis establishes the consistency of the proposed matrix estimation and shows that the important variables that have change points are selected consistently. Numerical studies on synthetic and several real data sets suggest that the dimension reduction strategy enhances the performance of existing approaches. Additionally, the results showcase the efficiency of the proposed algorithm for selecting important variables in high-dimensional sparse data.
Journal of Multivariate Analysis, Volume 213, Article 105594
Citations: 0
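The project-then-detect strategy described in the abstract above can be sketched in a few lines. This is an assumption-laden toy, not the paper's penalized matrix estimator: the projection direction is simply given rather than estimated, and a standard weighted CUSUM statistic locates a single mean change, whereas the paper targets multiple change points. The function names and data are hypothetical.

```python
import math


def project(X, direction):
    """Project each observation (row of X) onto a fixed direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in X]


def cusum_change_point(z):
    """Split index k (1 <= k < n) maximizing the weighted CUSUM statistic for a mean shift."""
    n = len(z)
    total = sum(z)
    best_k, best_stat = None, -1.0
    running = 0.0
    for k in range(1, n):
        running += z[k - 1]
        # |S_k - (k/n) S_n| scaled by sqrt(k (n - k) / n)
        stat = abs(running - k * total / n) / math.sqrt(k * (n - k) / n)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# Toy data: only the first coordinate shifts (by about 3) after the fifth observation.
first_coordinate = [0.1, -0.2, 0.0, 0.2, -0.1, 3.1, 2.9, 3.0, 3.2, 2.8]
X = [[v, 1.0, -1.0] for v in first_coordinate]
z = project(X, [1.0, 0.0, 0.0])  # assumed direction carrying the change
k_hat, stat = cusum_change_point(z)

if __name__ == "__main__":
    print("estimated change after observation:", k_hat)
```

Here the sparse direction (1, 0, 0) keeps all of the change point information while discarding the two coordinates that never change, which mirrors the role the abstract assigns to the dimension-reduced subspace.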
Subgroup effect quantile regression with high dimensional missing panel data
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2025-12-23 | DOI: 10.1016/j.jmva.2025.105593
Shu-Yu Li, Han-Ying Liang
Based on panel data, we explore partially linear varying-coefficient quantile regression with group effects under high dimension and missing observations. Using generalized estimating equations, we construct oracle estimators, along with smoothed versions, for the unknown parameter vector, the varying-coefficient functions and the group effects, and establish their asymptotic normality. In the estimation procedure, the within-subject correlations of the panel data are taken into account by introducing a working correlation matrix. We further investigate variable selection by the SCAD penalty for the parameters, varying-coefficient functions and group identification simultaneously, and discuss oracle properties. Meanwhile, hypothesis tests for the parameter, varying-coefficient functions and group effects are carried out, and the asymptotic distributions of the restricted estimators and test statistics under both the null and local alternative hypotheses are analyzed. Also, a simulation study and real data analysis are conducted to evaluate the performance of the proposed methods.
Journal of Multivariate Analysis, Volume 213, Article 105593
Citations: 0
Envelope-based partial least squares in functional regression
IF 1.4 | Zone 3, Mathematics | Q2 STATISTICS & PROBABILITY | Pub Date: 2025-12-23 | DOI: 10.1016/j.jmva.2025.105592
Minxuan Wu , Joseph Antonelli , Zhihua Su
In this article, we extend predictor envelope models to settings with multivariate outcomes and multiple functional predictors. We propose a two-step estimation strategy, which first projects the function onto a finite-dimensional Euclidean space before fitting the model using existing approaches to envelope models. We first develop an estimator under a linear model with continuous outcomes and then extend this procedure to the more general class of generalized linear models, which allow for a variety of outcome types. We provide asymptotic theory for these estimators showing that they are root-n consistent and asymptotically normal when the regression coefficient is finite-rank. Additionally, we show that consistency can be obtained even when the regression coefficient has a rank that grows with the sample size. Extensive simulation studies confirm our theoretical results and show strong prediction performance of the proposed estimators. Additionally, we provide multiple data analyses showing that the proposed approach performs well in real-world settings under a variety of outcome types compared with existing dimension reduction approaches.
Journal of Multivariate Analysis, Volume 213, Article 105592
Citations: 0
Automatic sparse estimation of the high-dimensional cross-covariance matrix 高维交叉协方差矩阵的自动稀疏估计
IF 1.4 Zone 3 Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2025-12-19 DOI: 10.1016/j.jmva.2025.105590
Tetsuya Umino, Kazuyoshi Yata, Makoto Aoshima
Scenarios involving high-dimensional, low-sample-size (HDLSS) data are often encountered in modern scientific fields involving genetic microarrays, medical imaging, and finance, where the number of variables can greatly exceed the number of observations. In such settings, a reliable estimation of cross-covariance structures is essential for understanding relationships between variable sets. However, classical estimators often exhibit severe noise accumulation. To address this issue, in this study, we propose a novel thresholding estimator of the cross-covariance matrix for HDLSS settings. We consider the asymptotic properties of the sample cross-covariance matrix and show that the estimator contains large amounts of noise in the high-dimensional setting, which renders it inconsistent. To solve this problem occurring in high-dimensional settings, we develop a new thresholding estimator based on the automatic sparse estimation methodology and show that the estimator is consistent under mild assumptions. We analyze and evaluate the performance of the proposed estimator based on numerical simulations and actual data analysis. The simulations demonstrate that the method attains consistency without requiring the stringent high-dimensional conditions assumed by existing approaches, and the real-data analysis illustrates its applicability to high-dimensional regression problems, wherein improved parameter estimation enhances prediction accuracy. In conclusion, our findings serve as a theoretically sound tool for cross-covariance estimation in HDLSS contexts, with potential implications for a wide range of high-dimensional data analyses.
Citations: 0