
Journal of Multivariate Analysis: Latest Articles

Enhanced Laplace approximation
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-26, DOI: 10.1016/j.jmva.2024.105321
Jeongseop Han, Youngjo Lee

The Laplace approximation has been proposed as a method for approximating the marginal likelihood of statistical models with latent variables. However, the approximate maximum likelihood estimators derived from the Laplace approximation are often biased for binary or temporally and/or spatially correlated data. Additionally, the corresponding Hessian matrix tends to underestimate the standard errors of these approximate maximum likelihood estimators. While higher-order approximations have been suggested, they are not applicable to complex models, such as correlated random effects models, and fail to provide consistent variance estimators. In this paper, we propose an enhanced Laplace approximation that provides the true maximum likelihood estimator and its consistent variance estimator. We study its relationship with the variational Bayes method. We also define a new restricted maximum likelihood estimator for estimating dispersion parameters and study its asymptotic properties. The enhanced Laplace approximation generally demonstrates how to obtain the true restricted maximum likelihood estimators and their variance estimators. Our numerical studies indicate that the enhanced Laplace approximation provides a satisfactory maximum likelihood estimator and restricted maximum likelihood estimator, as well as their variance estimators in the frequentist perspective. The maximum likelihood estimator and restricted maximum likelihood estimator can also be interpreted as the posterior mode and marginal posterior mode under flat priors, respectively. Furthermore, we present some comparisons with Bayesian procedures under different priors.
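
As a rough illustration of the ordinary (first-order) Laplace approximation that the abstract takes as its starting point, the sketch below approximates the marginal likelihood of a single binary observation with a Gaussian random intercept and compares it with brute-force quadrature. The model, the parameter names (`beta`, `sigma`) and the helper functions are assumptions chosen for illustration; this is not the enhanced approximation proposed in the paper.

```python
import math


def log_joint(v, y, beta, sigma):
    """log p(y | v) + log p(v) for y ~ Bernoulli(expit(beta + v)), v ~ N(0, sigma^2)."""
    eta = beta + v
    log_lik = y * eta - math.log1p(math.exp(eta))
    log_prior = -0.5 * math.log(2 * math.pi * sigma ** 2) - v ** 2 / (2 * sigma ** 2)
    return log_lik + log_prior


def laplace_marginal(y, beta, sigma, n_iter=50):
    """First-order Laplace approximation of the integral of exp(log_joint) over v."""
    v = 0.0
    for _ in range(n_iter):  # Newton iterations for the mode of log_joint in v
        p = 1.0 / (1.0 + math.exp(-(beta + v)))
        grad = y - p - v / sigma ** 2
        hess = -p * (1 - p) - 1 / sigma ** 2
        v -= grad / hess
    p = 1.0 / (1.0 + math.exp(-(beta + v)))
    hess = -p * (1 - p) - 1 / sigma ** 2
    # exp(h(v_hat)) * sqrt(2 * pi / -h''(v_hat))
    return math.exp(log_joint(v, y, beta, sigma)) * math.sqrt(2 * math.pi / -hess)


def quadrature_marginal(y, beta, sigma, n_grid=20001, width=10.0):
    """Trapezoidal-rule reference value on [-width * sigma, width * sigma]."""
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * math.exp(log_joint(lo + i * h, y, beta, sigma))
    return total * h
```

Calling `laplace_marginal(1, 0.5, 1.0)` and `quadrature_marginal(1, 0.5, 1.0)` side by side gives a feel for the approximation error on binary data; that gap, and the resulting bias in the estimators, is the problem the abstract sets out to address.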

Journal of Multivariate Analysis, Volume 202, Article 105321
Citations: 0
Multivariate unified skew-t distributions and their properties
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-26, DOI: 10.1016/j.jmva.2024.105322
Kesen Wang, Maicon J. Karling, Reinaldo B. Arellano-Valle, Marc G. Genton

The unified skew-t (SUT) is a flexible parametric multivariate distribution that accounts for skewness and heavy tails in the data. A few of its properties can be found scattered in the literature or in a parameterization that does not follow the original one for unified skew-normal (SUN) distributions, yet a systematic study is lacking. In this work, explicit properties of the multivariate SUT distribution are presented, such as its stochastic representations, moments, SUN-scale mixture representation, linear transformation, additivity, marginal distribution, canonical form, quadratic form, conditional distribution, change of latent dimensions, Mardia measures of multivariate skewness and kurtosis, and non-identifiability issue. These results are given in a parameterization that reduces to the original SUN distribution as a sub-model, hence facilitating the use of the SUT for applications. Several models based on the SUT distribution are provided for illustration.

Journal of Multivariate Analysis, Volume 203, Article 105322
Citations: 0
Testing distributional equality for functional random variables
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-22, DOI: 10.1016/j.jmva.2024.105318
Bilol Banerjee

In this article, we present a nonparametric method for the general two-sample problem involving functional random variables modeled as elements of a separable Hilbert space H. First, we present a general recipe based on linear projections to construct a measure of dissimilarity between two probability distributions on H. In particular, we consider a measure based on the energy statistic and present some of its nice theoretical properties. A plug-in estimator of this measure is used as the test statistic to construct a general two-sample test. The large-sample distribution of this statistic is derived under both the null and alternative hypotheses. However, since the quantiles of the limiting null distribution are analytically intractable, the test is calibrated using the permutation method. We prove the large-sample consistency of the resulting permutation test under fairly general assumptions. We also study the efficiency of the proposed test by establishing a new local asymptotic normality result for functional random variables. Using that result, we derive the asymptotic distribution of the permuted test statistic and the asymptotic power of the permutation test under local contiguous alternatives. This establishes that the permutation test is statistically efficient in the Pitman sense. Extensive simulation studies are carried out and a real data set is analyzed to compare the performance of our proposed test with some state-of-the-art methods.
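
The permutation calibration described above can be sketched as follows. This is a minimal illustration under stated assumptions: each curve is taken to be discretized on a common grid and stored as a list of floats, the Euclidean norm stands in for the norm of H, and the statistic is the classical plug-in energy distance rather than the projection-based measure developed in the paper. The function names are hypothetical.

```python
import math
import random


def euclid(a, b):
    """Euclidean distance between two curves sampled on the same grid."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def energy_statistic(xs, ys):
    """Plug-in energy distance: 2 E|X - Y| - E|X - X'| - E|Y - Y'|."""
    def mean_dist(us, vs):
        return sum(euclid(u, v) for u in us for v in vs) / (len(us) * len(vs))
    return 2 * mean_dist(xs, ys) - mean_dist(xs, xs) - mean_dist(ys, ys)


def permutation_pvalue(xs, ys, n_perm=200, seed=0):
    """Calibrate the statistic by randomly reassigning the pooled sample labels."""
    rng = random.Random(seed)
    observed = energy_statistic(xs, ys)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_statistic(pooled[:n], pooled[n:]) >= observed:
            exceed += 1
    # Adding 1 to numerator and denominator keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Because the null distribution is approximated by relabelling, no closed-form quantiles are needed, which is the motivation the abstract gives for using the permutation method.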

Journal of Multivariate Analysis, Volume 203, Article 105318
Citations: 0
A fast and accurate kernel-based independence test with applications to high-dimensional and functional data
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-20, DOI: 10.1016/j.jmva.2024.105320
Jin-Ting Zhang, Tianming Zhu

Testing the dependency between two random variables is an important inference problem in statistics since many statistical procedures rely on the assumption that the two samples are independent. To test whether two samples are independent, a so-called HSIC (Hilbert–Schmidt Independence Criterion)-based test has been proposed. Its null distribution is approximated either by permutation or by a Gamma approximation. In this paper, a new HSIC-based test is proposed. Its asymptotic null and alternative distributions are established. It is shown that the proposed test is root-n consistent. A three-cumulant matched chi-squared approximation is adopted to approximate the null distribution of the test statistic. By choosing a proper reproducing kernel, the proposed test can be applied to many different types of data including multivariate, high-dimensional, and functional data. Three simulation studies and two real data applications show that in terms of level accuracy, power, and computational cost, the proposed test outperforms several existing tests for multivariate, high-dimensional, and functional data.
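
For orientation, the existing permutation-calibrated HSIC test that the abstract mentions as a baseline can be sketched as below. The assumptions are: scalar observations, Gaussian kernels with a fixed bandwidth of 1.0, and the biased empirical HSIC. The three-cumulant matched chi-squared approximation proposed in the paper is not implemented here, and all function names are illustrative.

```python
import math
import random


def gaussian_gram(zs, bandwidth):
    """Gram matrix of the Gaussian kernel for scalar observations."""
    return [[math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2)) for b in zs] for a in zs]


def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H with H = I - 11'/n."""
    n = len(gram)
    row_means = [sum(r) / n for r in gram]
    grand_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, bandwidth=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2."""
    n = len(xs)
    kc = center(gaussian_gram(xs, bandwidth))
    lc = center(gaussian_gram(ys, bandwidth))
    return sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n)) / n ** 2


def hsic_permutation_pvalue(xs, ys, n_perm=200, seed=0):
    """Permuting ys breaks any pairing with xs, which mimics the independence null."""
    rng = random.Random(seed)
    observed = hsic(xs, ys)
    ys_perm = list(ys)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(ys_perm)
        if hsic(xs, ys_perm) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Each permutation recomputes the statistic at quadratic cost in the sample size, which hints at why a closed-form approximation to the null distribution is attractive in terms of computational cost.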

Journal of Multivariate Analysis, Volume 202, Article 105320
Citations: 0
Multivariate directional tail-weighted dependence measures
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-18, DOI: 10.1016/j.jmva.2024.105319
Xiaoting Li, Harry Joe

We propose a new family of directional dependence measures for multivariate distributions. The family of dependence measures is indexed by α ≥ 1. When α = 1, they measure the strength of dependence along different paths to the joint upper or lower orthant. For α large, they become tail-weighted dependence measures that put more weight in the joint upper or lower tails of the distribution. As α → ∞, we show the convergence of the directional dependence measures to the multivariate tail dependence function and characterize the convergence pattern with an asymptotic expansion. This expansion leads to a method to estimate the multivariate tail dependence function using weighted least squares regression. We develop rank-based sample estimators for the tail-weighted dependence measures and establish their asymptotic distributions. The practical utility of the tail-weighted dependence measures in multivariate tail inference is further demonstrated through their application to a financial dataset.

Journal of Multivariate Analysis, Volume 203, Article 105319
Open access PDF: https://www.sciencedirect.com/science/article/pii/S0047259X24000265/pdfft?md5=b41054186655fc814404cc641ffc0dfe&pid=1-s2.0-S0047259X24000265-main.pdf
Citations: 0
A uniform kernel trick for high and infinite-dimensional two-sample problems
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-12, DOI: 10.1016/j.jmva.2024.105317
Javier Cárcamo, Antonio Cuevas, Luis-Alberto Rodríguez

We use a suitable version of the so-called “kernel trick” to devise two-sample tests, especially focussed on high-dimensional and functional data. Our proposal entails a simplification of the practical problem of selecting an appropriate kernel function. Specifically, we apply a uniform variant of the kernel trick which involves the supremum within a class of kernel-based distances. We obtain the asymptotic distribution of the test statistic under the null and alternative hypotheses. The proofs rely on empirical processes theory, combined with the delta method and Hadamard directional differentiability techniques, and functional Karhunen–Loève-type expansions of the underlying processes. This methodology has some advantages over other standard approaches in the literature. We also give some experimental insight into the performance of our proposal compared to other kernel-based approaches (the original proposal by Borgwardt et al. (2006) and some variants based on splitting methods) as well as tests based on energy distances (Rizzo and Székely, 2017).

Journal of Multivariate Analysis, Volume 202, Article 105317
Open access PDF: https://www.sciencedirect.com/science/article/pii/S0047259X24000241/pdfft?md5=19f44db706891c9aa40d12d1b8b7030a&pid=1-s2.0-S0047259X24000241-main.pdf
Citations: 0
Sparse online regression algorithm with insensitive loss functions
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-03, DOI: 10.1016/j.jmva.2024.105316
Ting Hu, Jing Xiong

Online learning is an efficient approach in machine learning and statistics, which iteratively updates models upon the observation of a sequence of training examples. A representative online learning algorithm is the online gradient descent, which has found wide applications due to its low complexity and scalability to large datasets. Kernel-based learning methods have been proven to be quite successful in dealing with nonlinearity in the data and multivariate optimization. In this paper we present a class of kernel-based online gradient descent algorithms for addressing regression problems, which generates sparse estimators in an iterative way to reduce the algorithmic complexity for training streaming datasets and model selection in large-scale learning scenarios. In the setting of support vector regression (SVR), we design the sparse online learning algorithm by introducing a sequence of insensitive distance-based loss functions. We prove consistency and error bounds quantifying the generalization performance of such algorithms under mild conditions. The theoretical results demonstrate the interplay between statistical accuracy and sparsity property during learning processes. We show that the insensitive parameter plays a crucial role in providing sparsity as well as fast convergence rates. The numerical experiments also support our theoretical results.
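
To make the link between the insensitive parameter and sparsity concrete, here is a minimal sketch of kernel online (sub)gradient descent with an ε-insensitive loss. Everything specific is an assumption for illustration: a fixed `eps` rather than the sequence of insensitive parameters used in the paper, a Gaussian kernel on scalar inputs, a step size decaying as `step0 / sqrt(t)`, no regularization term, and the class name `SparseOnlineSVR`.

```python
import math


def gaussian_kernel(a, b, bandwidth=0.5):
    """Gaussian kernel on scalar inputs; the bandwidth is an arbitrary choice."""
    return math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2))


class SparseOnlineSVR:
    """Kernel online subgradient descent for the loss max(0, |f(x) - y| - eps)."""

    def __init__(self, eps=0.1, step0=0.5):
        self.eps = eps
        self.step0 = step0
        self.t = 0          # number of observations seen so far
        self.centers = []   # inputs that triggered an update
        self.coefs = []     # kernel expansion coefficients

    def predict(self, x):
        return sum(c * gaussian_kernel(z, x) for c, z in zip(self.coefs, self.centers))

    def update(self, x, y):
        """Process one observation; return True if a new kernel term was added."""
        self.t += 1
        residual = self.predict(x) - y
        if abs(residual) <= self.eps:
            # Inside the insensitive tube the subgradient is zero,
            # so the estimator gains no new term.
            return False
        step = self.step0 / math.sqrt(self.t)
        self.centers.append(x)
        self.coefs.append(-step * math.copysign(1.0, residual))
        return True
```

Observations that fall inside the tube leave the estimator untouched, so the number of stored centers can be much smaller than the length of the stream; a larger `eps` makes such skips more frequent at the cost of a coarser fit.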

Journal of Multivariate Analysis, Volume 202, Article 105316
Citations: 0
Efficient calibration of computer models with multivariate output
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-03-21, DOI: 10.1016/j.jmva.2024.105315
Yang Sun, Xiangzhong Fang

The classical calibration procedures for computer models concern only univariate output, which is often insufficient in practice. Multivariate output is increasingly prevalent in a wide range of real-world applications, which motivates us to develop a new calibration procedure that extends the classical calibration methods to multivariate cases. In this work, we propose an efficient calibration procedure for multivariate output within restricted correlation. First, we construct an estimator of the discrepancy function between the true process and the computer model by the local linear approximation, then obtain an estimator of the calibration parameter by the weighted profile least squares and establish its asymptotic properties. In addition, we also develop an estimator of the calibration parameter in a special situation, whose asymptotic normality has been derived. Numerical studies including simulations and an application to composite fuselage simulation verify the efficiency of the proposed calibration procedure.

Journal of Multivariate Analysis, Volume 202, Article 105315
Citations: 0
On extreme quantile region estimation under heavy-tailed elliptical distributions
IF 1.6, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-03-20, DOI: 10.1016/j.jmva.2024.105314
Jaakko Pere, Pauliina Ilmonen, Lauri Viitasaari

Consider the estimation of an extreme quantile region corresponding to a very small probability. Estimation of extreme quantile regions is important but difficult since extreme regions contain only a few or no observations. In this article, we propose an affine equivariant extreme quantile region estimator for heavy-tailed elliptical distributions. The estimator is constructed by extending a well-known univariate extreme quantile estimator. Consistency of the estimator is proved under estimated location and scatter. The practicality of the developed estimator is illustrated with simulations and a real data example.
Citations: 0
Online stochastic Newton methods for estimating the geometric median and applications
IF 1.6 Zone 3 Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2024-03-19 DOI: 10.1016/j.jmva.2024.105313
Antoine Godichon-Baggioni, Wei Lu

In large samples, a small number of individuals can distort basic statistical indicators such as the mean. Automatically detecting these atypical individuals is difficult, so an alternative strategy is to use robust approaches. This paper focuses on estimating the geometric median of a random variable, which is a robust indicator of central tendency. To handle large samples of data arriving sequentially, online stochastic Newton algorithms for estimating the geometric median are introduced, and their rates of convergence are given. Since the estimates of the median and of the Hessian matrix can be updated recursively, we also determine confidence intervals for the median in any designated direction and perform online statistical tests.
Citations: 0