An inertial reflected-forward-backward splitting method for monotone inclusions with improved step size
Pub Date: 2026-01-08 | DOI: 10.1007/s10444-025-10265-5
Van Dung Nguyen, Hoang Thi Kim Hoa
In this paper, we propose an inertial splitting algorithm to compute a zero of the sum of a maximally monotone operator and a monotone, Lipschitz continuous operator. This work extends the reflected-forward-backward method by incorporating inertial effects. We prove convergence of the algorithm in a Hilbert space setting and show that the range of admissible step sizes can be enlarged. Linear convergence of the proposed method is obtained under a condition akin to strong monotonicity. We also present some simple numerical experiments to demonstrate the efficiency of the proposed algorithm.
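As a rough illustration of such a scheme, the following Python toy applies an inertial, reflected forward-backward step to a small monotone inclusion. Everything here is an illustrative assumption rather than the paper's exact method: the resolvent of the maximally monotone operator is taken to be a box projection, the single-valued operator is affine, and the inertial weight and step size are ad hoc choices.

```python
import numpy as np

# Toy monotone inclusion 0 ∈ A(x) + B(x) with A the normal cone of the
# box [0,1]^n (resolvent = projection) and B(x) = Mx + q monotone and
# Lipschitz. Inertial weight, step size, and test problem are
# illustrative assumptions, not the paper's parameter rules.
rng = np.random.default_rng(0)
n = 5
S = rng.standard_normal((n, n))
M = S @ S.T / n + np.eye(n) + (S - S.T)  # strongly monotone: SPD + skew part
q = rng.standard_normal(n)
B = lambda x: M @ x + q
proj = lambda z: np.clip(z, 0.0, 1.0)    # resolvent of the normal cone

L = np.linalg.norm(M, 2)                 # Lipschitz constant of B
lam, alpha = 0.3 / L, 0.1                # step size and inertial weight (assumed)

x_prev = x = np.zeros(n)
for _ in range(2000):
    y = x + alpha * (x - x_prev)                      # inertial extrapolation
    x_prev, x = x, proj(y - lam * B(2 * x - x_prev))  # reflected forward-backward step

residual = np.linalg.norm(x - proj(x - B(x)))  # fixed-point (natural map) residual
print(residual)
```

On this strongly monotone toy the residual decays rapidly, consistent with the linear convergence the abstract describes.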
{"title":"An inertial reflected-forward-backward splitting method for monotone inclusions with improved step size","authors":"Van Dung Nguyen, Hoang Thi Kim Hoa","doi":"10.1007/s10444-025-10265-5","DOIUrl":"10.1007/s10444-025-10265-5","url":null,"abstract":"<div><p>In this paper, we propose an inertial splitting algorithm to compute a zero of the sum of a maximally monotone operator and a monotone and Lipschitz continuous operator. This work aims to extend reflected-forward-backward method by using inertial effects. We prove the convergence of the algorithm in a Hilbert space setting and show that the range of step size can be improved. The linear convergence of the proposed method is obtained under a condition akin to strong monotonicity. We also give some simple numerical experiments to demonstrate the efficiency of the proposed algorithm.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"52 1","pages":""},"PeriodicalIF":2.1,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145915716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data-driven optimal approximation on Hardy spaces in simply connected domains
Pub Date: 2025-12-09 | DOI: 10.1007/s10444-025-10275-3
Alessandro Borghi, Tobias Breiten
We consider optimal interpolation of functions analytic in simply connected domains in the complex plane. By choosing a specific structure for the approximant, we show that the resulting first-order optimality conditions can be interpreted as optimal \(\mathcal{H}_2\) interpolation conditions for discrete-time dynamical systems. Connections to model reduction of discrete-time time-invariant delay systems are also established, with particular emphasis on discretized linear systems obtained through the implicit Euler method, the midpoint method, and backward differentiation methods. A data-driven algorithm is developed to compute a (locally) optimal approximant. Our method is tested on three numerical experiments.
{"title":"Data-driven optimal approximation on Hardy spaces in simply connected domains","authors":"Alessandro Borghi, Tobias Breiten","doi":"10.1007/s10444-025-10275-3","DOIUrl":"10.1007/s10444-025-10275-3","url":null,"abstract":"<div><p>We consider optimal interpolation of functions analytic in simply connected domains in the complex plane. By choosing a specific structure for the approximant, we show that the resulting first-order optimality conditions can be interpreted as optimal <span>(varvec{mathcal {H}}_{varvec{2}})</span> interpolation conditions for discrete-time dynamical systems. Connections to model reduction of discrete-time time-invariant delay systems are also established with particular emphasis on discretized linear systems obtained through the implicit Euler method, the midpoint method, and backward differentiation methods. A data-driven algorithm is developed to compute a (locally) optimal approximant. Our method is tested on three numerical experiments.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10444-025-10275-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145703979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online learning algorithms tackling covariate shift
Pub Date: 2025-12-09 | DOI: 10.1007/s10444-025-10276-2
Zheng-Chu Guo, Lei Shi
Covariate shift refers to the change in the distribution of input data (covariates) between the training and testing phases of a machine learning model. Standard regression typically assumes that training and testing samples come from the same distribution, an assumption that often fails in practice. Various methods have been developed to address covariate shift, with importance weighting being one of the most widely used. While existing literature under covariate shift primarily focuses on batch learning, the high algorithmic complexity of these methods can significantly hinder their performance in big data scenarios. In contrast, online learning processes data incrementally, updating outputs in real time, which allows for more efficient handling of large-scale and streaming datasets. This paper explores the application of importance weighting correction for online learning algorithms in reproducing kernel Hilbert spaces under covariate shift. Our findings demonstrate fast convergence rates for the reweighted online learning algorithms, particularly when the importance weight function has a finite second moment.
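The importance-weighted online update the abstract describes can be sketched as follows. This is a minimal toy, not the paper's setting: the densities, Gaussian kernel, and step-size schedule are all illustrative assumptions; training inputs follow a density p, test inputs follow q, and each stochastic gradient step is reweighted by w(x) = q(x)/p(x).

```python
import numpy as np

# Importance-weighted online kernel regression under covariate shift
# (illustrative assumptions throughout): train density p = N(0,1),
# test density q = N(0.5, 0.8^2), reweighting w(x) = q(x)/p(x).
rng = np.random.default_rng(1)
f_true = lambda x: np.sin(2 * x)
p = lambda x: np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
q = lambda x: np.exp(-(x - 0.5) ** 2 / 1.28) / np.sqrt(1.28 * np.pi)

T = 800
Xs, cs = np.zeros(T), np.zeros(T)   # kernel expansion f_t(x) = sum_i cs[i] k(Xs[i], x)
predict = lambda x, t: cs[:t] @ np.exp(-(Xs[:t] - x) ** 2)   # Gaussian kernel

for t in range(T):
    x = rng.standard_normal()                    # sample from the training density p
    y = f_true(x) + 0.1 * rng.standard_normal()
    eta = 0.5 / np.sqrt(t + 1)                   # decaying step size (assumed)
    cs[t] = -eta * (q(x) / p(x)) * (predict(x, t) - y)  # reweighted gradient step
    Xs[t] = x

x_test = 0.5 + 0.8 * rng.standard_normal(300)    # evaluate under the test density q
mse = np.mean([(predict(z, T) - f_true(z)) ** 2 for z in x_test])
print(mse)
```

Because the test variance is smaller than the training variance here, the weight q/p stays bounded, matching the abstract's finite-second-moment condition in spirit.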
{"title":"Online learning algorithms tackling covariate shift","authors":"Zheng-Chu Guo, Lei Shi","doi":"10.1007/s10444-025-10276-2","DOIUrl":"10.1007/s10444-025-10276-2","url":null,"abstract":"<div><p>Covariate shift refers to the change in the distribution of input data (covariates) between the training and testing phases of a machine learning model. Standard regression typically assumes that training and testing samples come from the same distribution, an assumption that often fails in practice. Various methods have been developed to address covariate shift, with importance weighting being one of the most widely used. While existing literature under covariate shift primarily focuses on batch learning, the high algorithmic complexity of these methods can significantly hinder their performance in big data scenarios. In contrast, online learning processes data incrementally, updating outputs in real time, which allows for more efficient handling of large-scale and streaming datasets. This paper explores the application of importance weighting correction for online learning algorithms in reproducing kernel Hilbert spaces under covariate shift. Our findings demonstrate fast convergence rates for the reweighted online learning algorithms, particularly when the importance weight function has a finite second moment.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145703988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bisparse blind deconvolution through hierarchical sparse recovery
Pub Date: 2025-12-04 | DOI: 10.1007/s10444-025-10271-7
Axel Flinth, Ingo Roth, Gerhard Wunder
The hierarchical sparsity framework, and in particular the HiHTP algorithm (Hierarchical Hard Thresholding Pursuit), has recently been applied successfully to many relevant communication engineering problems, particularly when the signal space is hierarchically structured. In this paper, the applicability of the HiHTP algorithm for solving the bi-sparse blind deconvolution problem is studied. The bi-sparse blind deconvolution setting here consists of recovering \(h\) and \(b\) from the knowledge of \(h * (Qb)\), where \(Q\) is some linear operator and both \(b\) and \(h\) are assumed to be sparse. The approach rests upon lifting the problem to a linear one and then applying HiHTP through the hierarchical sparsity framework. Then, for a Gaussian draw of the random matrix \(Q\), it is shown theoretically that an \(s\)-sparse \(h \in \mathbb{K}^{\mu}\) and a \(\sigma\)-sparse \(b \in \mathbb{K}^{n}\) can be recovered with high probability when \(\mu \gtrsim s \log(s)^2 \log(\mu) \log(\mu n) + s\sigma \log(n)\).
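The hierarchical thresholding at the heart of HiHTP can be sketched as follows. This toy implements only the (s, σ)-hierarchical hard thresholding operator (block layout and example values are assumptions); the full algorithm alternates it with gradient steps and a least-squares solve on the detected support.

```python
import numpy as np

# (s, sigma)-hierarchical hard thresholding: within every block keep the
# sigma largest-magnitude entries, then keep the s blocks of largest
# l2 norm. Blocks are the rows of a 2-D array.
def hi_threshold(x, s, sigma):
    X = np.array(x, dtype=float, copy=True)
    for row in X:                                        # per-block thresholding
        if sigma < row.size:
            row[np.argsort(np.abs(row))[:-sigma]] = 0.0  # zero the small entries
    norms = np.linalg.norm(X, axis=1)
    if s < X.shape[0]:
        X[np.argsort(norms)[:-s]] = 0.0                  # zero low-energy blocks
    return X

x = np.array([[0.1, 3.0, -0.2, 0.5],
              [2.0, -2.5, 0.3, 0.1],
              [0.4, 0.2, -0.1, 0.3]])
print(hi_threshold(x, s=2, sigma=2))
```

The result keeps at most s nonzero blocks with at most σ nonzero entries each, which is exactly the hierarchical sparsity pattern the recovery guarantee refers to.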
{"title":"Bisparse blind deconvolution through hierarchical sparse recovery","authors":"Axel Flinth, Ingo Roth, Gerhard Wunder","doi":"10.1007/s10444-025-10271-7","DOIUrl":"10.1007/s10444-025-10271-7","url":null,"abstract":"<div><p>The <i>hierarchical sparsity framework</i>, and in particular the HiHTP algorithm(Hierarchical Hard Thresholding Pursuit), has been successfully applied to many relevant communication engineering problems recently, particularly when the signal space is hierarchically structured. In this paper, the applicability of the HiHTP algorithm for solving the bi-sparse blind deconvolution problem is studied. The bi-sparse blind deconvolution setting here consists of recovering <span>(varvec{h})</span> and <span>(varvec{b})</span> from the knowledge of <span>(varvec{h}varvec{*}varvec{(Qb)})</span>, where <span>(varvec{Q})</span> is some linear operator, and both <span>(varvec{b})</span> and <span>(varvec{h})</span> are assumed to be sparse. The approach rests upon lifting the problem to a linear one, and then applying HiHTP, through the <i>hierarchical sparsity framework</i>. 
Then, for a Gaussian draw of the random matrix <span>(varvec{Q})</span>, it is theoretically shown that an <span>(varvec{s})</span>-sparse <span>(varvec{h} varvec{in } varvec{mathbb {K}}^{varvec{mu }})</span> and <span>(varvec{sigma })</span>-sparse <span>(varvec{b} varvec{in } varvec{mathbb {K}}^{varvec{n}})</span> with high probability can be recovered when <span>(varvec{mu } varvec{gtrsim } varvec{s}, varvec{log }varvec{(s)}^{varvec{2}}, varvec{log }varvec{(mu )}, varvec{log }varvec{(mu n)} varvec{+} varvec{s}varvec{sigma }, varvec{log }varvec{(n)})</span>.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10444-025-10271-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145675223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Asymptotically steerable finite Fourier-Bessel transforms and closure under convolution
Pub Date: 2025-12-04 | DOI: 10.1007/s10444-025-10268-2
Arash Ghaani Farashahi, Gregory S. Chirikjian
This paper develops a constructive numerical scheme for Fourier-Bessel approximations on disks compatible with convolutions supported on disks. We address accurate finite Fourier-Bessel transforms (FFBT) and inverse finite Fourier-Bessel transforms (iFFBT) of functions on disks using the discrete Fourier transform (DFT) on Cartesian grids. Whereas the DFT and its fast implementation (FFT) are ubiquitous and powerful for computing convolutions, they are not exactly steerable under rotations. In contrast, Fourier-Bessel expansions are steerable, but lose both this property and the preservation of band limits under convolution. This work captures the best features of both as the band limit is allowed to increase. The convergence/error analysis and asymptotic steerability of the FFBT/iFFBT are investigated. Conditions are established for the FFBT to converge to the Fourier-Bessel coefficients and for the iFFBT to uniformly approximate the Fourier-Bessel partial sums. The matrix form of the finite transforms is discussed. The implementation of the discrete method to compute numerical approximations of convolutions of compactly supported functions on disks is considered as well.
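For orientation, here is a direct-quadrature sketch of the radial Fourier-Bessel expansion underlying such transforms. This is illustrative only: it computes coefficients by one-dimensional quadrature, whereas the paper's FFBT evaluates them through the DFT on Cartesian grids; the test function and truncation level are assumptions.

```python
import numpy as np
from scipy.special import jv, jn_zeros

# Radial Fourier-Bessel expansion on the unit disk:
# f(r) ≈ sum_k c_k J_0(lam_k r), lam_k the positive zeros of J_0.
lams = jn_zeros(0, 20)                       # first 20 zeros of J_0
r = np.linspace(0.0, 1.0, 4000)
f = 1.0 - r ** 2                             # smooth radial test function, f(1) = 0
trap = lambda g: np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(r))  # trapezoid rule

# c_k = <f, J_0(lam_k r)> / ||J_0(lam_k r)||^2, with weight r and
# ||J_0(lam_k r)||^2 = J_1(lam_k)^2 / 2 on [0, 1]
coeffs = [trap(f * jv(0, lam * r) * r) / (0.5 * jv(1, lam) ** 2) for lam in lams]

f_rec = sum(c * jv(0, lam * r) for c, lam in zip(coeffs, lams))
err = np.max(np.abs(f - f_rec))
print(err)
```

Since this f vanishes at the boundary like the basis functions, the partial sums converge uniformly and the truncation error is already small with 20 terms.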
{"title":"Asymptotically steerable finite Fourier-Bessel transforms and closure under convolution","authors":"Arash Ghaani Farashahi, Gregory S. Chirikjian","doi":"10.1007/s10444-025-10268-2","DOIUrl":"10.1007/s10444-025-10268-2","url":null,"abstract":"<div><p>This paper develops a constructive numerical scheme for Fourier-Bessel approximations on disks compatible with convolutions supported on disks. We address accurate finite Fourier-Bessel transforms (FFBT) and inverse finite Fourier-Bessel transforms (iFFBT) of functions on disks using the discrete Fourier Transform (DFT) on Cartesian grids. Whereas the DFT and its fast implementation (FFT) are ubiquitous and are powerful for computing convolutions, they are not exactly steerable under rotations. In contrast, Fourier-Bessel expansions are steerable, but lose both this property and the preservation of band limits under convolution. This work captures the best features of both as the band limit is allowed to increase. The convergence/error analysis and asymptotic steerability of FFBT/iFFBT are investigated. Conditions are established for the FFBT to converge to the Fourier-Bessel coefficient and for the iFFBT to uniformly approximate the Fourier-Bessel partial sums. The matrix form of the finite transforms is discussed. 
The implementation of the discrete method to compute numerical approximation of convolutions of compactly supported functions on disks is considered as well.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10444-025-10268-2.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145675224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the normalization of trigonometric and hyperbolic B-splines
Pub Date: 2025-12-02 | DOI: 10.1007/s10444-025-10252-w
Hendrik Speleers
Trigonometric and hyperbolic B-splines can be computed via recurrence relations analogous to those for the classical polynomial B-splines. However, in their original formulation, these two types of B-splines do not form a partition of unity and consequently do not admit the notion of control polygons with the convex hull property for design purposes. In this paper, we look into explicit expressions for their normalization and provide a recursive algorithm to compute the corresponding normalization weights. As an example application, we consider the exact representation of a circle in terms of \(C^{2n-1}\) trigonometric B-splines of order \(m=2n+1\ge 3\), with a variable number of control points. We also illustrate the approximation power of trigonometric and hyperbolic splines.
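The recurrence and the failure of the partition of unity can be seen in a few lines. The sketch below uses the standard Cox-de Boor-type recurrence with a generic weight function (the knot vector and evaluation point are illustrative choices, not from the paper): the identity weight gives polynomial B-splines, which sum to one, while the sine weight gives unnormalized trigonometric B-splines, which do not.

```python
import numpy as np

def bspline(x, t, i, k, w):
    # Cox-de Boor-type recurrence with a weight function w:
    # w(u) = u gives the classical polynomial B-splines, while
    # w(u) = sin(u/2) gives the (unnormalized) trigonometric ones.
    if k == 1:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = w(x - t[i]) / w(t[i + k - 1] - t[i]) * bspline(x, t, i, k - 1, w)
    right = w(t[i + k] - x) / w(t[i + k] - t[i + 1]) * bspline(x, t, i + 1, k - 1, w)
    return left + right

t = np.arange(8.0)   # uniform knots with spacing 1
k, x = 3, 3.4        # order 3, point where a full set of splines overlaps
poly = sum(bspline(x, t, i, k, lambda u: u) for i in range(len(t) - k))
trig = sum(bspline(x, t, i, k, lambda u: np.sin(u / 2)) for i in range(len(t) - k))
print(poly, trig)    # poly sums to 1; trig deviates from 1
```

The deviation of the trigonometric sum from one is precisely what the paper's normalization weights are designed to remove.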
{"title":"On the normalization of trigonometric and hyperbolic B-splines","authors":"Hendrik Speleers","doi":"10.1007/s10444-025-10252-w","DOIUrl":"10.1007/s10444-025-10252-w","url":null,"abstract":"<div><p>Trigonometric and hyperbolic B-splines can be computed via recurrence relations analogous to the classical polynomial B-splines. However, in their original formulation, these two types of B-splines do not form a partition of unity and consequently do not admit the notion of control polygons with the convex hull property for design purposes. In this paper, we look into explicit expressions for their normalization and provide a recursive algorithm to compute the corresponding normalization weights. As example application, we consider the exact representation of a circle in terms of <span>(C^{2n-1})</span> trigonometric B-splines of order <span>(m=2n+1ge 3)</span>, with a variable number of control points. We also illustrate the approximation power of trigonometric and hyperbolic splines.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10444-025-10252-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145657530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep polytopic autoencoders for low-dimensional linear parameter-varying approximations and nonlinear feedback controller design
Pub Date: 2025-12-01 | DOI: 10.1007/s10444-025-10269-1
Jan Heiland, Yongho Kim, Steffen W. R. Werner
Polytopic autoencoders provide low-dimensional parametrizations of states in a polytope. For nonlinear partial differential equations (PDEs), this is readily applied to low-dimensional linear parameter-varying (LPV) approximations as they have been exploited for efficient nonlinear controller design via series expansions of the solution to the state-dependent Riccati equation. In this work, we develop a polytopic autoencoder for control applications and show how it improves on standard linear approaches in view of LPV approximations of nonlinear systems. We discuss how the particular architecture enables exact representations of target states and higher-order series expansions of the nonlinear feedback law at little extra computational effort in the online phase. In the offline phase, a system of linear though high-dimensional and nonstandard Lyapunov equations has to be solved. Here, we expand on how to adapt state-of-the-art methods for the efficient numerical treatment. In a numerical study, we illustrate the procedure and how this approach can reliably outperform the standard linear-quadratic regulator design.
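The basic polytopic parametrization can be sketched in a few lines: the decoder returns a convex combination of vertices, so every reconstructed state lies in their polytope. The softmax encoder below is a hypothetical stand-in for the paper's trained autoencoder, included only to show the simplex structure of the codes.

```python
import numpy as np

# Schematic polytopic decoding: states are reconstructed as convex
# combinations of vertices, so reconstructions stay inside the polytope.
rng = np.random.default_rng(4)
d, r = 6, 4                       # state dimension, number of polytope vertices
V = rng.standard_normal((d, r))   # vertex matrix (would be learned in practice)

def encode(x):
    z = V.T @ x                   # hypothetical encoder, not the paper's network
    e = np.exp(z - z.max())
    return e / e.sum()            # coordinates on the probability simplex

rho = encode(rng.standard_normal(d))
x_hat = V @ rho                   # decoded state: convex combination of vertices
print(rho.sum(), rho.min() >= 0)
```

The simplex coordinates rho are exactly the parameter of a linear parameter-varying (LPV) representation: the nonlinear dynamics can then be written with state-dependent coefficients that are convex combinations of a few fixed matrices.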
{"title":"Deep polytopic autoencoders for low-dimensional linear parameter-varying approximations and nonlinear feedback controller design","authors":"Jan Heiland, Yongho Kim, Steffen W. R. Werner","doi":"10.1007/s10444-025-10269-1","DOIUrl":"10.1007/s10444-025-10269-1","url":null,"abstract":"<div><p>Polytopic autoencoders provide low-dimensional parametrizations of states in a polytope. For nonlinear partial differential equations (PDEs), this is readily applied to low-dimensional linear parameter-varying (LPV) approximations as they have been exploited for efficient nonlinear controller design via series expansions of the solution to the state-dependent Riccati equation. In this work, we develop a polytopic autoencoder for control applications and show how it improves on standard linear approaches in view of LPV approximations of nonlinear systems. We discuss how the particular architecture enables exact representations of target states and higher-order series expansions of the nonlinear feedback law at little extra computational effort in the online phase. In the offline phase, a system of linear though high-dimensional and nonstandard Lyapunov equations has to be solved. Here, we expand on how to adapt state-of-the-art methods for the efficient numerical treatment. 
In a numerical study, we illustrate the procedure and how this approach can reliably outperform the standard linear-quadratic regulator design.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10444-025-10269-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145645273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developing and analyzing some new finite element methods for a non-local hydrodynamic Drude model
Pub Date: 2025-12-01 | DOI: 10.1007/s10444-025-10272-6
Yunqing Huang, Jichun Li, Xin Liu
In this paper, we are interested in studying the interaction of light with metallic nanostructures modeled by a non-local hydrodynamic Drude model, which consists of a system of partial differential equations coupled to Maxwell’s equations. Solving this model is interesting but challenging, since it requires not only the curl-conforming basis functions used for the standard Maxwell’s equations, but also divergence-conforming basis functions. Several novel finite element schemes are proposed and analyzed. Numerical results are presented to justify our theoretical analysis. This is the first paper to solve this time-domain non-local hydrodynamic Drude model with only the electric field and polarization current as unknowns.
{"title":"Developing and analyzing some new finite element methods for a non-local hydrodynamic Drude model","authors":"Yunqing Huang, Jichun Li, Xin Liu","doi":"10.1007/s10444-025-10272-6","DOIUrl":"10.1007/s10444-025-10272-6","url":null,"abstract":"<div><p>In this paper, we are interested in studying the interaction of light with metallic nanostructures modeled by a non-local hydrodynamic Drude model, which consists of a system of partial differential equations coupled to Maxwell’s equations. Solving this model is interesting but challenging, since it needs not only the curl conforming basis function as for the standard Maxwell’s equations, but also the divergence conforming basis function. Several novel finite element schemes are proposed and analyzed. Numerical results are presented to justify our theoretical analysis. This is the first paper on solving this time-domain non-local hydrodynamic Drude model with only the electric field and polarization current as unknowns.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145645278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An interpolation–regression approach for function approximation on the disk and its application to cubature formulas
Pub Date: 2025-11-14 | DOI: 10.1007/s10444-025-10267-3
Francesco Dell’Accio, Francisco Marcellán, Federico Nudo
The interpolation–regression approximation is a powerful tool in numerical analysis for reconstructing functions defined on square or triangular domains from their evaluations at a regular set of nodes. The importance of this technique lies in its ability to avoid the Runge phenomenon. In this paper, we present a polynomial approximation method based on an interpolation–regression approach for reconstructing functions defined on disk domains from their evaluations at a general set of sampling points. Special attention is devoted to the selection of interpolation nodes to ensure numerical stability, particularly in the context of Zernike polynomials. As an application, the proposed method is used to derive accurate cubature formulas for numerical integration over the disk.
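The regression-to-cubature idea can be sketched as follows: fit a polynomial to scattered samples on the disk by least squares, then integrate the fitted polynomial exactly using closed-form monomial moments. This plain monomial sketch is only illustrative; the paper works with Zernike polynomials and carefully selected interpolation nodes for numerical stability.

```python
import numpy as np
from math import factorial, pi

def disk_moment(i, j):
    # exact integral of x^i y^j over the unit disk (0 if i or j is odd)
    if i % 2 or j % 2:
        return 0.0
    p, q = i // 2, j // 2
    ang = 2 * pi * factorial(2 * p) * factorial(2 * q) / (
        4 ** (p + q) * factorial(p) * factorial(q) * factorial(p + q))
    return ang / (i + j + 2)

rng = np.random.default_rng(5)
pts = rng.uniform(-1, 1, (400, 2))
pts = pts[(pts ** 2).sum(axis=1) <= 1.0]      # scattered sample points in the disk
f = lambda x, y: x ** 2 + y ** 2              # test integrand; exact integral pi/2

deg = 4                                        # total degree of the fitted polynomial
powers = [(i, j) for i in range(deg + 1) for j in range(deg + 1 - i)]
V = np.column_stack([pts[:, 0] ** i * pts[:, 1] ** j for i, j in powers])
coef, *_ = np.linalg.lstsq(V, f(pts[:, 0], pts[:, 1]), rcond=None)
integral = coef @ np.array([disk_moment(i, j) for i, j in powers])
print(integral)    # ≈ pi/2
```

Because the integrand here is itself a polynomial of degree at most 4, the least-squares fit is exact and the cubature reproduces the integral to machine precision.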
{"title":"An interpolation–regression approach for function approximation on the disk and its application to cubature formulas","authors":"Francesco Dell’Accio, Francisco Marcellán, Federico Nudo","doi":"10.1007/s10444-025-10267-3","DOIUrl":"10.1007/s10444-025-10267-3","url":null,"abstract":"<div><p>The interpolation–regression approximation is a powerful tool in numerical analysis for reconstructing functions defined on square or triangular domains from their evaluations at a regular set of nodes. The importance of this technique lies in its ability to avoid the Runge phenomenon. In this paper, we present a polynomial approximation method based on an interpolation–regression approach for reconstructing functions defined on disk domains from their evaluations at a general set of sampling points. Special attention is devoted to the selection of interpolation nodes to ensure numerical stability, particularly in the context of Zernike polynomials. As an application, the proposed method is used to derive accurate cubature formulas for numerical integration over the disk.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10444-025-10267-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145509682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust kernel-based gradient descent with random features
Pub Date: 2025-10-28 | DOI: 10.1007/s10444-025-10263-7
Qi Hong, Zheng-Chu Guo
In large-scale machine learning, the computational cost of kernel methods can become prohibitive due to the need to compute pairwise kernel evaluations on extensive datasets. The random feature method is one of the most popular techniques for accelerating kernel methods in large-scale problems while maintaining statistical accuracy. In this paper, we investigate the generalization properties of a robust gradient descent algorithm utilizing random features within a statistical learning framework, where we employ the robust loss function \(l_{\sigma}\) instead of the traditional squared loss during training. This loss function is defined by a windowing function G and a scale parameter \(\sigma\), allowing it to encompass a wide range of commonly used robust losses for regression when G and \(\sigma\) are appropriately selected. However, it remains unclear whether the random feature method can preserve statistical accuracy in this context. We analyze the generalization error of the estimator produced by the gradient descent algorithm with random features. Our findings demonstrate that with a suitably chosen scale parameter \(\sigma\) and an appropriate number of random features M, our estimator can converge to the regression function in \(L^2\)-norm at minimax-optimal rates (up to a logarithmic term), even if the regression function does not reside in the reproducing kernel Hilbert space.
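A minimal sketch of this pipeline: random Fourier features replace the kernel, and gradient descent is run on a windowed robust loss. The Welsch loss used below, \(l_{\sigma}(r) = (\sigma^2/2)(1 - e^{-r^2/\sigma^2})\), is one instance of the \(l_{\sigma}\) family; the data, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

# Robust gradient descent with random Fourier features (illustrative
# toy). Outliers get down-weighted by l_sigma'(r) = r * exp(-r^2/sigma^2).
rng = np.random.default_rng(3)
n, M, sigma = 300, 100, 1.0

X = rng.uniform(-1, 1, n)
y = np.sin(np.pi * X) + 0.1 * rng.standard_normal(n)
y[::25] += 5.0                               # inject a few gross outliers

# random Fourier features approximating the Gaussian kernel exp(-2|x-x'|^2)
W, b = 2.0 * rng.standard_normal(M), rng.uniform(0, 2 * np.pi, M)
feats = lambda x: np.sqrt(2.0 / M) * np.cos(np.outer(x, W) + b)

Phi, theta = feats(X), np.zeros(M)
for _ in range(500):
    r = Phi @ theta - y
    theta -= Phi.T @ (r * np.exp(-r ** 2 / sigma ** 2)) / n  # robust-loss gradient

xs = np.linspace(-1, 1, 200)
mse = np.mean((feats(xs) @ theta - np.sin(np.pi * xs)) ** 2)
print(mse)
```

Because the outlier residuals are large from the start, their weights exp(-r^2/sigma^2) are essentially zero, so the fit effectively performs least squares on the clean samples.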
{"title":"Robust kernel-based gradient descent with random features","authors":"Qi Hong, Zheng-Chu Guo","doi":"10.1007/s10444-025-10263-7","DOIUrl":"10.1007/s10444-025-10263-7","url":null,"abstract":"<div><p>In large-scale machine learning, the computational cost of kernel methods can become prohibitive due to the need to compute pairwise kernel evaluations on extensive datasets. The random feature method is one of the most popular techniques for accelerating kernel methods in large-scale problems while maintaining statistical accuracy. In this paper, we investigate the generalization properties of a robust gradient descent algorithm utilizing random features within a statistical learning framework, where we employ the robust loss function <span>(l_{sigma })</span> instead of the traditional squared loss during training. This loss function is defined by a windowing function <i>G</i> and a scale parameter <span>(sigma )</span>, allowing it to encompass a wide range of commonly used robust losses for regression when <i>G</i> and <span>(sigma )</span> are appropriately selected. However, it remains unclear whether the random feature method can preserve statistical accuracy in this context. We analyze the generalization error of the estimator produced by the gradient descent algorithm with random features. 
Our findings demonstrate that with a suitably chosen scale parameter <span>(sigma )</span> and an appropriate number of random features <i>M</i>, our estimator can converge to the regression function in <span>(L^2)</span>-norm at optimal rates in the mini-max sense (up to a logarithmic term), even if the regression function may not reside in the reproducing kernel Hilbert space.</p></div>","PeriodicalId":50869,"journal":{"name":"Advances in Computational Mathematics","volume":"51 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145382467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}