Journal of Multivariate Analysis: Latest Articles
Statistical guarantees for distribution estimation of contaminated data via DNN-based MoM-GANs
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105571
Fang Xie, Lihu Xu, Qiuran Yao, Huiming Zhang
This paper investigates the distribution estimation of contaminated data using the MoM-GAN method, which leverages the power of generative adversarial nets (GANs) and median-of-means (MoM) estimation. Specifically, we use a deep neural network (DNN) with a ReLU activation function to model the generator and discriminator of the GAN. In terms of theoretical analysis, we derive a non-asymptotic error bound for the DNN-based MoM-GAN estimator, which is measured by integral probability metrics and takes into account the b-smoothness Hölder class. The error bound essentially decreases in n^{-b/p} ∨ n^{-1/2}, where n and p are the sample size and the dimension of the input data, respectively. It provides a rigorous guarantee of the accuracy and robustness of the MoM-GAN estimator, even in the presence of contaminated data. We present an algorithm for the MoM-GAN method and demonstrate its effectiveness in two real-world applications. Our results show that the MoM-GAN method outperforms other competing methods when dealing with contaminated data, highlighting its superior performance and robustness.
Journal of Multivariate Analysis, Volume 212, Article 105571
Citations: 0
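The median-of-means (MoM) idea named in the MoM-GAN abstract above can be illustrated on the simplest possible target, a scalar mean. This is a minimal sketch under stated assumptions, not the authors' MoM-GAN algorithm: the function name `median_of_means`, the block count, and the toy data are all hypothetical choices made for illustration. The data are shuffled, split into blocks, averaged per block, and the median of the block means is returned, so that a few contaminated points can spoil only a few blocks.

```python
import random
import statistics


def median_of_means(values, num_blocks, seed=0):
    """Median-of-means estimate of the mean of `values` (illustrative helper)."""
    if not 1 <= num_blocks <= len(values):
        raise ValueError("num_blocks must be between 1 and len(values)")
    shuffled = list(values)
    # Shuffle so that block membership does not depend on the input order.
    random.Random(seed).shuffle(shuffled)
    # Strided slicing yields num_blocks disjoint blocks of nearly equal size.
    blocks = [shuffled[i::num_blocks] for i in range(num_blocks)]
    block_means = [sum(block) / len(block) for block in blocks]
    # The median ignores the minority of blocks hit by outliers.
    return statistics.median(block_means)


# Toy contaminated sample: 99 clean points at 1.0 and one gross outlier.
contaminated = [1.0] * 99 + [1000.0]
plain_mean = sum(contaminated) / len(contaminated)
mom_estimate = median_of_means(contaminated, num_blocks=5)
print(plain_mean, mom_estimate)
```

In this toy sample the single outlier can fall into only one of the five blocks, so the median of the block means stays at the clean value while the plain average is pulled far away from it.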
A latent space model for link prediction in statistical citation network
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105555
Rui Pan, Yuan Gao, Hansheng Wang
Link prediction is of vital importance in network analysis. In this work, we propose a novel latent space model for link prediction in a statistical citation network. Specifically, the model can incorporate the transitivity information of both the citation network and the author-paper network. In addition, nodal features are also taken into consideration and the pseudo maximum likelihood estimation of the corresponding parameter is developed. The asymptotic consistency is established and demonstrated through extensive simulation studies. Link prediction is then performed and the performance is compared among different methods. Finally, a real citation network of statistics is analyzed.
Journal of Multivariate Analysis, Volume 212, Article 105555
Citations: 0
A latent factor model for high-dimensional binary data
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105554
Jiaxin Shi, Yuan Gao, Rui Pan, Hansheng Wang
In this study, we develop a latent factor model for analyzing high-dimensional binary data. Specifically, a standard probit model is used to describe the regression relationship between the observed binary data and the continuous latent variables. Our method assumes that the dependency structure of the observed binary data can be fully captured by the continuous latent factors. To estimate the model, a moment-based estimation method is developed. The proposed method is able to deal with both discontinuity and high dimensionality. Most importantly, the asymptotic properties of the resulting estimators are rigorously established. Extensive simulation studies are presented to demonstrate the proposed methodology. A real dataset about product descriptions is analyzed for illustration.
Journal of Multivariate Analysis, Volume 212, Article 105554
Citations: 0
Testing and measuring the conditional mean (in)dependence for functional data by martingale difference-angle divergence
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105573
Tingyu Lai, Yingying Wang, Zhongzhan Zhang
We propose a new nonparametric method to test and measure conditional mean (in)dependence for functional data. This new metric has some appealing properties: it is nonnegative and equals zero if and only if conditional mean independence holds; it is invariant under linear transformations of the predictor; it does not require a moment condition for the predictor variable. Based on this measure, two test procedures for conditional mean independence are proposed for functional data. One uses a wild bootstrap while the other uses the limiting standard normal distribution. The tests are consistent and perform well in finite sample simulations. We further propose some requirements for a reasonable conditional mean dependence measure and demonstrate that our metric has those properties. A real data example is introduced to illustrate the application of the proposed method.
Journal of Multivariate Analysis, Volume 212, Article 105573
Citations: 0
Bayesian analysis of nonlinear structured latent factor models with a Gaussian process prior
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105577
Yimang Zhang, Xiaorui Wang, Jian Qing Shi
Factor analysis models are widely used in social and behavioral sciences, such as psychology, education, and marketing, to measure unobservable latent traits. In this article, we introduce a nonlinear structured latent factor analysis model that is more flexible in characterizing the relationship between manifest variables and latent factors. The confirmatory identifiability of the latent factor is discussed, ensuring the substantive interpretation of these latent factors. A Bayesian approach with a Gaussian process prior is proposed to estimate the unknown nonlinear function and the unknown parameters. Asymptotic results are established, including the structured identifiability of latent factors, as well as the consistency of estimates for the unknown parameters and the unknown nonlinear function. Simulation studies and real data analysis are conducted to evaluate the performance of the proposed method. The simulation results demonstrate that our proposed method performs well in handling nonlinear models and successfully identifies the latent factors. Additionally, the analysis of oil flow data reveals the underlying structure of latent nonlinear patterns.
Journal of Multivariate Analysis, Volume 212, Article 105577
Citations: 0
On convergence of regularized covariance estimator based on modified Cholesky decomposition
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105553
Yuli Liang, Deliang Dai, Shaobo Jin
Regularization of the covariance matrix is a widely used technique when estimating large covariance matrices. This paper examines a penalized likelihood method for constructing a statistically efficient covariance matrix estimator. Modified Cholesky decomposition (MCD) is used to parameterize the covariance matrix and an effective regularization scheme is achieved by combining both shrinkage and smoothing penalties on the Cholesky factor. The practical performance is at odds with an absence of theoretical properties of the derived estimators in the literature. In this work, we aim to fill the gap between theory and practice by establishing the convergence properties under regularity conditions. We also provide a simulation study as a numerical illustration.
Journal of Multivariate Analysis, Volume 212, Article 105553
Citations: 0
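The modified Cholesky decomposition mentioned in the abstract above writes a positive definite covariance matrix Σ as T Σ T' = D, with T unit lower triangular and D diagonal. The sketch below shows only this unpenalized parameterization in plain Python; it is not the penalized likelihood estimator studied in the paper, and the function names and the 3 x 3 example matrix are hypothetical choices made for illustration.

```python
def modified_cholesky(sigma):
    """Return (T, D): T unit lower triangular, D a list, T sigma T' = diag(D)."""
    p = len(sigma)
    # Step 1: LDL' factorisation, sigma = L diag(D) L', with L unit lower triangular.
    L = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    D = [0.0] * p
    for j in range(p):
        D[j] = sigma[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        if D[j] <= 0:
            raise ValueError("sigma must be positive definite")
        for i in range(j + 1, p):
            s = sum(L[i][k] * L[j][k] * D[k] for k in range(j))
            L[i][j] = (sigma[i][j] - s) / D[j]
    # Step 2: T = L^{-1}, obtained column by column via forward substitution.
    T = [[0.0] * p for _ in range(p)]
    for col in range(p):
        for i in range(p):
            rhs = 1.0 if i == col else 0.0
            T[i][col] = rhs - sum(L[i][k] * T[k][col] for k in range(i))
    return T, D


def matmul(A, B):
    """Plain-Python matrix product, used only to check the factorisation."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(row) for row in zip(*A)]


# Hypothetical 3 x 3 covariance matrix for the check below.
sigma = [[4.0, 2.0, 0.6], [2.0, 2.0, 0.5], [0.6, 0.5, 1.0]]
T, D = modified_cholesky(sigma)
check = matmul(matmul(T, sigma), transpose(T))  # should be diag(D) up to rounding
print(D)
```

In the usual reading of this parameterization, the below-diagonal entries of row j of T are the negatives of the coefficients from regressing variable j on its predecessors, and D holds the corresponding innovation variances. That structure is what makes it natural to place penalties on the Cholesky factor, as the abstract describes.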
Limiting spectral distribution of high-dimensional integrated covariance matrices based on high-frequency data with multiple transactions
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105568
Moming Wang, Ningning Xia, Yong Zhou
Due to the heavy trading volume in financial markets and the limitations of recording mechanisms, the occurrence of multiple transactions during each recording period is a common feature of high-frequency data. This paper investigates how the number of such multiple transactions impacts the behavior of an averaged version of time-variation adjusted realized covariance (ATVA) matrix in a high-dimensional situation, where the number of stocks and the observation frequency go to infinity proportionally. By using random matrix theory, we derive the limiting spectral distribution (LSD) of ATVA matrices based on high-frequency multiple observations. We demonstrate how the LSD of ATVA matrices depends on the number of multiple transactions. The study of the LSD of random matrices is not only theoretically interesting in itself but also provides a better insight into the pre-averaging approach, which is widely used to deal with the microstructure noise. Furthermore, we investigate the limits of spiked eigenvalues of ATVA matrices when the covariance matrix of asset prices exhibits a spiked pattern. Finally, the theoretical results are supported by simulation studies.
Journal of Multivariate Analysis, Volume 212, Article 105568
Citations: 0
Threshold models for high-dimensional time series with network structure
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105560
Chi Tim Ng, Chun Yip Yau, Yuanbo Li, Lei Qin
Threshold autoregressive (TAR) models form an important class of nonlinear time series models and have attracted great attention in the literature. In order to extend threshold modeling to high-dimensional nonlinear time series, a threshold network autoregressive (TNAR) model is proposed in this paper to overcome the difficulty of over-parameterization by exploiting the available information of network relations. The proposed model can characterize the regime-switching feature in nonlinear complex network systems. Sufficient conditions for the strict stationarity and the ergodicity of the TNAR model are established. A computationally efficient method based on group LASSO is developed to estimate the multiple thresholds and the parameters. A grouped TNAR model is also proposed to further reduce the number of parameters. The asymptotic behavior of the proposed method is explored and the estimation consistency of both the number of groups and the group membership structure is established.
Journal of Multivariate Analysis, Volume 212, Article 105560
Citations: 0
Robust bilinear factor analysis based on the matrix-variate t distribution
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105575
Xuan Ma, Jianhua Zhao, Changchun Shang, Fen Jiang, Philip L.H. Yu
Factor Analysis based on the multivariate t distribution (tFA) is a useful robust tool for extracting common factors on heavy-tailed or contaminated data. However, tFA is only applicable to vector data. When tFA is applied to matrix data, it is common to first vectorize the matrix observations. This introduces two challenges for tFA: (i) the inherent matrix structure of the data is broken, and (ii) robustness may be lost, as vectorized matrix data typically results in a high data dimension, which could easily lead to the breakdown of tFA. To address these issues, starting from the intrinsic matrix structure of matrix data, a novel robust factor analysis model, namely bilinear factor analysis built on the matrix-variate t distribution (tBFA), is proposed in this paper. The novelty is that it is capable of simultaneously extracting common factors for both row and column variables of interest on heavy-tailed or contaminated matrix data. Two efficient algorithms for maximum likelihood estimation of tBFA are developed. Closed-form expressions for the Fisher information matrix to calculate the accuracy of parameter estimates are derived. Empirical studies are conducted to understand the proposed tBFA model and compare it with related competitors. The results demonstrate the superiority and practicality of tBFA. Importantly, tBFA exhibits a significantly higher breakdown point than tFA, making it more suitable for matrix data.
Journal of Multivariate Analysis, Volume 212, Article 105575
Citations: 0
Estimation of tensor factor model by iterative least squares
IF 1.4, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2025-11-28, DOI: 10.1016/j.jmva.2025.105557
Yong He, Yujie Hou, Yalin Wang, Wen-Xin Zhou
For large-dimensional tensor time series, dimension reduction plays a pivotal role. The tensor factor model depicts tensor-valued time series through a low-dimensional projection on a space of common factors, thereby achieving great dimension reduction and having a wide range of applications in economics and finance. In this paper, we propose a simple iterative least squares algorithm for estimating the tensor factor model. We first estimate the latent common factors by using deterministic mode-k projection matrices and then estimate the loading matrices by minimizing the squared Frobenius loss function under certain identifiability conditions. The estimated loading matrices are further taken as new mode-k projection matrices, and the above update procedures are iteratively executed until convergence. We also propose a novel eigenvalue ratio method for estimating the number of factors and show the consistency of the estimators. Given the true number of factors, we theoretically establish the convergence rates of the estimated loading matrices and signal components at the s-th iteration for any s ≥ 1. Thorough numerical studies are conducted to investigate the finite-sample performance of the proposed method. Analyses of import-export transport networks and lung cancer histopathological image datasets illustrate the empirical usefulness of the proposed method.
Journal of Multivariate Analysis, vol. 212, Article 105557.
Citations: 0