Online learning is an efficient approach in machine learning and statistics that iteratively updates a model as a sequence of training examples is observed. A representative online learning algorithm is online gradient descent, which has found wide application due to its low complexity and scalability to large datasets. Kernel-based learning methods have proven quite successful in handling nonlinearity in the data and multivariate optimization. In this paper we present a class of kernel-based online gradient descent algorithms for regression problems, which generate sparse estimators iteratively, reducing the algorithmic complexity of training on streaming data and of model selection in large-scale learning scenarios. In the setting of support vector regression (SVR), we design the sparse online learning algorithm by introducing a sequence of insensitive distance-based loss functions. We prove consistency and error bounds quantifying the generalization performance of such algorithms under mild conditions. The theoretical results demonstrate the interplay between statistical accuracy and the sparsity of the estimators during the learning process. We show that the insensitivity parameter plays a crucial role in providing sparsity as well as fast convergence rates. Numerical experiments also support our theoretical results.
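To make the mechanism concrete, the following is a minimal sketch of kernel-based online gradient descent with an insensitive (tube-type) loss, in the spirit of the algorithm described above. It assumes a Gaussian kernel, a step size decaying like $1/\sqrt{t}$, and a simple regularization shrinkage; the function and parameter names (`kernel_online_gd_svr`, `eps`, `eta0`, `lam`, `sigma`) are illustrative and not those used in the paper. The point of the sketch is that whenever the residual falls inside the insensitive tube, no new support vector is added, which is the source of sparsity.

```python
import numpy as np

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel; the kernel choice here is an illustrative assumption."""
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def kernel_online_gd_svr(stream, eps=0.1, eta0=0.5, lam=0.01, sigma=1.0):
    """Sketch of kernel online gradient descent with an eps-insensitive loss.

    stream : iterable of (x_t, y_t) pairs observed one at a time.
    eps    : insensitivity parameter; larger values give sparser estimators.
    eta0   : base step size; eta_t = eta0 / sqrt(t) is one common schedule.
    lam    : regularization parameter that shrinks existing coefficients.
    Returns the support points and coefficients of f(x) = sum_i alpha_i K(x_i, x).
    """
    supports, alphas = [], []
    for t, (x_t, y_t) in enumerate(stream, start=1):
        # Current prediction f_t(x_t) from the stored (sparse) kernel expansion.
        f_xt = sum(a * gaussian_kernel(x_i, x_t, sigma)
                   for x_i, a in zip(supports, alphas))
        eta_t = eta0 / np.sqrt(t)
        # Shrink existing coefficients (gradient of the regularization term).
        alphas = [(1.0 - eta_t * lam) * a for a in alphas]
        residual = f_xt - y_t
        # eps-insensitive loss: no gradient step, and hence no new support
        # vector, when the residual lies inside the insensitive tube.
        if abs(residual) > eps:
            supports.append(x_t)
            alphas.append(-eta_t * np.sign(residual))
    return supports, alphas

# Toy usage: a noisy sine regression stream; the data here are synthetic.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.standard_normal(200)
S, A = kernel_online_gd_svr(zip(X, y), eps=0.1)
print(len(S), "support vectors retained out of", len(X), "examples")
```

In this sketch, increasing `eps` enlarges the insensitive tube, so fewer examples trigger an update and the resulting estimator is sparser, illustrating the accuracy-sparsity trade-off discussed above.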