Optimal prediction of vector-valued functions from point samples
Simon Foucart
Pub Date: 2025-08-18 | DOI: 10.1016/j.jco.2025.101981 | Journal of Complexity 92, Article 101981
Predicting the value of a function f at a new point given its values at old points is a ubiquitous scientific endeavor, one that is somewhat less developed when f produces several interdependent values, e.g., when it outputs a probability vector. Considering the points as fixed (not random) entities and focusing on the worst case, this article uncovers a prediction procedure that is optimal relative to some model-set information about the vector-valued function f. When the model sets are convex, this procedure turns out to be an affine map constructed by solving a convex optimization program. The theoretical result is specified in the two practical frameworks of (reproducing kernel) Hilbert spaces and of spaces of continuous functions.
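The closing remark on (reproducing kernel) Hilbert spaces admits a compact illustration: over a Hilbert-ball model set, worst-case-optimal recovery of f at a new point is given by minimum-norm kernel interpolation, applied to each output component through the same linear weights. A minimal sketch, assuming a Gaussian kernel and synthetic data (the paper's general affine map for arbitrary convex model sets is not reproduced here):

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    # Pairwise Gaussian kernel matrix between the rows of X and the rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def predict(X_old, F_old, x_new, gamma=1.0, reg=1e-10):
    # Minimum-norm interpolation: solve K(X, X) w = k(X, x_new); the
    # prediction F_old^T w is a linear map of the observed vector values.
    K = gaussian_kernel(X_old, X_old, gamma) + reg * np.eye(len(X_old))
    k = gaussian_kernel(X_old, x_new[None, :], gamma)[:, 0]
    w = np.linalg.solve(K, k)
    return F_old.T @ w  # one predicted value per output component

rng = np.random.default_rng(0)
X_old = rng.uniform(size=(20, 2))  # 20 fixed sample points in [0,1]^2
F_old = np.stack([np.sin(X_old.sum(1)), np.cos(X_old.sum(1))], axis=1)
print(predict(X_old, F_old, np.array([0.3, 0.7])))
```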
{"title":"Optimal prediction of vector-valued functions from point samples","authors":"Simon Foucart","doi":"10.1016/j.jco.2025.101981","DOIUrl":"10.1016/j.jco.2025.101981","url":null,"abstract":"<div><div>Predicting the value of a function <em>f</em> at a new point given its values at old points is an ubiquitous scientific endeavor, somewhat less developed when <em>f</em> produces several values depending on one another, e.g. when it outputs a probability vector. Considering the points as fixed (not random) entities and focusing on the worst-case, this article uncovers a prediction procedure that is optimal relatively to some model-set information about the vector-valued function <em>f</em>. When the model sets are convex, this procedure turns out to be an affine map constructed by solving a convex optimization program. The theoretical result is specified in the two practical frameworks of (reproducing kernel) Hilbert spaces and of spaces of continuous functions.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"92 ","pages":"Article 101981"},"PeriodicalIF":1.8,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144867183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the complexity of p-order cone programs
Víctor Blanco, Victor Magron, Miguel Martínez-Antón
Pub Date: 2025-07-29 | DOI: 10.1016/j.jco.2025.101979 | Journal of Complexity 91, Article 101979
This manuscript explores novel complexity results for the feasibility problem over p-order cones, extending the foundational work of Porkolab and Khachiyan (1997) [30]. By leveraging the intrinsic structure of p-order cones, we derive refined complexity bounds that surpass those obtained via standard semidefinite programming reformulations. Our analysis not only improves theoretical bounds but also provides practical insights into the computational efficiency of solving such problems. In addition to establishing complexity results, we derive explicit bounds on solutions when the feasibility problem admits one. For infeasible instances, we analyze their discrepancy, quantifying the degree of infeasibility. Finally, we examine specific cases of interest, highlighting scenarios where the geometry of p-order cones or the problem structure yields further computational simplifications. These findings contribute to both the theoretical understanding and the practical tractability of optimization problems involving p-order cones.
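For readers who want to experiment, feasibility over a p-order cone is directly expressible in an off-the-shelf modeling language. A minimal sketch with CVXPY and random affine data (illustrating the problem class only, not the paper's complexity analysis):

```python
import cvxpy as cp
import numpy as np

# Feasibility over the p-order cone K_p = {(t, x) : ||x||_p <= t}
# intersected with an affine subspace A z = b; data are hypothetical.
rng = np.random.default_rng(1)
p, n, m = 3, 5, 2
A = rng.normal(size=(m, n + 1))
b = rng.normal(size=m)

z = cp.Variable(n + 1)  # z[0] plays the role of the cone's "height" t
constraints = [cp.norm(z[1:], p) <= z[0], A @ z == b]
prob = cp.Problem(cp.Minimize(0), constraints)  # pure feasibility program
prob.solve()
print(prob.status, z.value)
```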
{"title":"On the complexity of p-order cone programs","authors":"Víctor Blanco , Victor Magron , Miguel Martínez-Antón","doi":"10.1016/j.jco.2025.101979","DOIUrl":"10.1016/j.jco.2025.101979","url":null,"abstract":"<div><div>This manuscript explores novel complexity results for the feasibility problem over <em>p</em>-order cones, extending the foundational work of Porkolab and Khachiyan (1997) <span><span>[30]</span></span>. By leveraging the intrinsic structure of <em>p</em>-order cones, we derive refined complexity bounds that surpass those obtained via standard semidefinite programming reformulations. Our analysis not only improves theoretical bounds but also provides practical insights into the computational efficiency of solving such problems. In addition to establishing complexity results, we derive explicit bounds for solutions when the feasibility problem admits one. For infeasible instances, we analyze their discrepancy quantifying the degree of infeasibility. Finally, we examine specific cases of interest, highlighting scenarios where the geometry of <em>p</em>-order cones or problem structure yields further computational simplifications. These findings contribute to both the theoretical understanding and practical tractability of optimization problems involving <em>p</em>-order cones.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"91 ","pages":"Article 101979"},"PeriodicalIF":1.8,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144724397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning from strongly mixing observations: Sparse-penalized regularization and minimax optimality
William Kengne, Modou Wade
Pub Date: 2025-07-28 | DOI: 10.1016/j.jco.2025.101978 | Journal of Complexity 92, Article 101978
This paper considers deep learning from strongly mixing observations and performs sparse-penalized regularization for deep neural network (DNN) predictors. In a general framework that includes regression and classification, oracle inequalities for the expected excess risk are established, and upper bounds are provided on the class of Hölder smooth functions and on composition-structured Hölder functions. For nonparametric autoregression with Gaussian and Laplace errors and the Huber loss function, the proposed sparse-penalized DNN estimator is shown to be optimal (up to a logarithmic factor) in the minimax sense. Based on the lower bound established in Alquier and Kengne (2024), we show that the proposed DNN estimator for the classification task with the logistic loss on strongly mixing observations achieves, up to a logarithmic factor, the minimax optimal convergence rate.
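As a toy illustration of the estimator's ingredients, sparse-penalized DNN training with the Huber loss can be sketched as below; the l1 penalty stands in for the paper's sparsity penalty, and the architecture, penalty level, and (i.i.d. rather than strongly mixing) data are illustrative assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 1)
y = torch.sin(3 * X) + 0.1 * torch.randn(256, 1)  # toy regression data

model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
huber = nn.HuberLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
lam = 1e-4  # sparsity level (assumed, not tuned)

for epoch in range(200):
    opt.zero_grad()
    fit = huber(model(X), y)                             # robust data fit
    l1 = sum(p.abs().sum() for p in model.parameters())  # sparsity penalty
    (fit + lam * l1).backward()
    opt.step()
print(float(huber(model(X), y)))
```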
{"title":"Deep learning from strongly mixing observations: Sparse-penalized regularization and minimax optimality","authors":"William Kengne , Modou Wade","doi":"10.1016/j.jco.2025.101978","DOIUrl":"10.1016/j.jco.2025.101978","url":null,"abstract":"<div><div>This paper considers deep learning from strongly mixing observations and performs a sparse-penalized regularization for deep neural networks (DNN) predictors. In a general framework that includes regression and classification, oracle inequalities for the expected excess risk are established, and upper bounds on the class of Hölder smooth functions and composition structured Hölder functions are provided. For nonparametric autoregression with the Gaussian and Laplace errors, and the Huber loss function, it is shown that the sparse-penalized DNN estimator proposed is optimal (up to a logarithmic factor) in the minimax sense. Based on the lower bound established in Alquier and Kengne (2024), we show that the proposed DNN estimator for the classification task with the logistic loss on strongly mixing observations achieves (up to a logarithmic factor), the minimax optimal convergence rate.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"92 ","pages":"Article 101978"},"PeriodicalIF":1.8,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144748669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Regularized reduced-rank regression for structured output prediction
Heng Chen, Di-Rong Chen, Kun Cheng, Yang Zhou
Pub Date: 2025-07-18 | DOI: 10.1016/j.jco.2025.101977 | Journal of Complexity 92, Article 101977
Reduced-rank regression (RRR) has been widely used to strengthen the dependency among multiple outputs. This paper develops a regularized vector-valued RRR approach, which plays an important role in predicting multiple structured outputs. The estimator of vector-valued RRR is obtained by minimizing the empirical squared reproducing kernel Hilbert space (RKHS) distances between the output feature kernel and all r-dimensional subspaces of the vector-valued RKHS. The algorithm is easily implemented with kernel tricks. We establish the learning rate of the vector-valued RRR estimator under mild assumptions. Moreover, as a reduced-dimensional approximation of the output kernel regression function, the estimator converges to the output regression function in probability when the rank r tends to infinity appropriately. This implies the consistency of the structured predictor in general settings, especially in the misspecified case where the true regression function is not contained in the hypothesis space. Numerical experiments are provided to illustrate the efficiency of our method.
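The classical linear special case conveys the reduced-rank idea behind the paper's kernelized, vector-valued estimator: fit by least squares, then project the coefficient matrix onto the top-r singular directions of the fitted values. A sketch with synthetic data:

```python
import numpy as np

def reduced_rank_regression(X, Y, r):
    # Classical linear RRR: project the OLS coefficients onto the top-r
    # right singular directions of the fitted values X @ B_ols.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    P = Vt[:r].T @ Vt[:r]  # rank-r orthogonal projector in output space
    return B_ols @ P

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 10))
B_true = rng.normal(size=(10, 2)) @ rng.normal(size=(2, 6))  # rank-2 truth
Y = X @ B_true + 0.1 * rng.normal(size=(200, 6))
B_hat = reduced_rank_regression(X, Y, r=2)
print(np.linalg.matrix_rank(B_hat))  # 2
```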
{"title":"Regularized reduced-rank regression for structured output prediction","authors":"Heng Chen , Di-Rong Chen , Kun Cheng , Yang Zhou","doi":"10.1016/j.jco.2025.101977","DOIUrl":"10.1016/j.jco.2025.101977","url":null,"abstract":"<div><div>Reduced-rank regression (RRR) has been widely used to strength the dependency among multiple outputs. This paper develops a regularized vector-valued RRR approach, which plays an important role in predicting multiple outputs with structures. The estimator of vector-valued RRR is obtained by minimizing the empirically squared reproducing kernel Hilbert space (RKHS) distances between output feature kernel and all <em>r</em> dimensional subspaces in vector-valued RKHS. The algorithm is implemented easily with kernel tricks. We establish the learning rate of vector-valued RRR estimator under mild assumptions. Moreover, as a reduced-dimensional approximation of output kernel regression function, the estimator converges to the output regression function in probability when the rank <em>r</em> tends to infinity appropriately. It implies the consistency of structured predictor in general settings, especially in a misspecified case where the true regression function is not contained in the hypothesis space. Numerical experiments are provided to illustrate the efficiency of our method.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"92 ","pages":"Article 101977"},"PeriodicalIF":1.8,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144679194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimal complexity solution of space-time finite element systems for state-based parabolic distributed optimal control problems
Richard Löscher, Michael Reichelt, Olaf Steinbach
Pub Date: 2025-07-10 | DOI: 10.1016/j.jco.2025.101976 | Journal of Complexity 92, Article 101976
In this paper we consider a distributed optimal control problem subject to a parabolic evolution equation as constraint. The approach presented here is based on the variational formulation of the parabolic evolution equation in anisotropic Sobolev spaces, considering the control in $[H^{1,1/2}_{0;,0}(Q)]^{\ast}$. Since the state equation defines an isomorphism from $H^{1,1/2}_{0;0,}(Q)$ onto $[H^{1,1}_{0;,0}(Q)]^{\ast}$, we can eliminate the control to end up with a minimization problem in $H^{1,1/2}_{0;0,}(Q)$, where the anisotropic Sobolev norm can be realized using a modified Hilbert transformation. In the unconstrained case, the minimizer is the unique solution of a singularly perturbed elliptic equation. In the case of a space-time tensor-product mesh, we can use sparse factorization techniques to construct a solver of almost linear complexity. Numerical examples also include additional state constraints and a nonlinear state equation.
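As a generic illustration of the last two points (not the paper's space-time solver), the following solves a 1d singularly perturbed problem −εu″ + u = f with a sparse tridiagonal system, whose factorization costs O(n):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

# Sketch: -eps*u'' + u = f on (0,1) with u(0) = u(1) = 0, discretized by
# centered finite differences on a uniform mesh (an illustrative stand-in
# for the singularly perturbed elliptic problem mentioned above).
eps, n = 1e-3, 1000
h = 1.0 / (n + 1)
A = diags([-eps / h**2, 2 * eps / h**2 + 1, -eps / h**2], [-1, 0, 1],
          shape=(n, n), format="csc")
f = np.ones(n)
u = spsolve(A, f)        # sparse LU: linear cost for a tridiagonal system
print(u[:3], u[n // 2])  # boundary layer near 0 vs. interior value near 1
```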
{"title":"Optimal complexity solution of space-time finite element systems for state-based parabolic distributed optimal control problems","authors":"Richard Löscher, Michael Reichelt, Olaf Steinbach","doi":"10.1016/j.jco.2025.101976","DOIUrl":"10.1016/j.jco.2025.101976","url":null,"abstract":"<div><div>In this paper we consider a distributed optimal control problem subject to a parabolic evolution equation as constraint. The approach presented here is based on the variational formulation of the parabolic evolution equation in anisotropic Sobolev spaces, considering the control in <span><math><msup><mrow><mo>[</mo><msubsup><mrow><mi>H</mi></mrow><mrow><mn>0</mn><mo>;</mo><mo>,</mo><mn>0</mn></mrow><mrow><mn>1</mn><mo>,</mo><mn>1</mn><mo>/</mo><mn>2</mn></mrow></msubsup><mo>(</mo><mi>Q</mi><mo>)</mo><mo>]</mo></mrow><mrow><mo>⁎</mo></mrow></msup></math></span>. Since the state equation defines an isomorphism from <span><math><msubsup><mrow><mi>H</mi></mrow><mrow><mn>0</mn><mo>;</mo><mn>0</mn><mo>,</mo></mrow><mrow><mn>1</mn><mo>,</mo><mn>1</mn><mo>/</mo><mn>2</mn></mrow></msubsup><mo>(</mo><mi>Q</mi><mo>)</mo></math></span> onto <span><math><msup><mrow><mo>[</mo><msubsup><mrow><mi>H</mi></mrow><mrow><mn>0</mn><mo>;</mo><mo>,</mo><mn>0</mn></mrow><mrow><mn>1</mn><mo>,</mo><mn>1</mn></mrow></msubsup><mo>(</mo><mi>Q</mi><mo>)</mo><mo>]</mo></mrow><mrow><mo>⁎</mo></mrow></msup></math></span>, we can eliminate the control to end up with a minimization problem in <span><math><msubsup><mrow><mi>H</mi></mrow><mrow><mn>0</mn><mo>;</mo><mn>0</mn><mo>,</mo></mrow><mrow><mn>1</mn><mo>,</mo><mn>1</mn><mo>/</mo><mn>2</mn></mrow></msubsup><mo>(</mo><mi>Q</mi><mo>)</mo></math></span> where the anisotropic Sobolev norm can be realized using a modified Hilbert transformation. In the unconstrained case, the minimizer is the unique solution of a singularly perturbed elliptic equation. In the case of a space-time tensor-product mesh, we can use sparse factorization techniques to construct a solver of almost linear complexity. Numerical examples also include additional state constraints, and a nonlinear state equation.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"92 ","pages":"Article 101976"},"PeriodicalIF":1.8,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144632611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On optimal recovery and information complexity in numerical differentiation and summation
Y.V. Semenova, S.G. Solodky
Pub Date: 2025-07-04 | DOI: 10.1016/j.jco.2025.101975 | Journal of Complexity 92, Article 101975
In this paper, we study the problems of numerical differentiation and summation of univariate functions from the weighted Wiener classes. To solve these problems, we propose an approach based on the truncation method. The essence of this method is to replace the infinite Fourier series with a finite sum; one only needs to properly select the order of this sum, which plays the role of a regularization parameter here. The results show that the proposed approach not only ensures the stability of the approximations and avoids cumbersome computational procedures, but also yields algorithms that achieve the optimal order of accuracy using a minimal amount of perturbed values of the Fourier-Chebyshev coefficients. Moreover, we establish under what conditions the summation problem is well-posed on the considered function classes.
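The truncation idea is easy to demonstrate with a Chebyshev expansion: fit the perturbed samples, cut the expansion off at order N, and differentiate the finite sum. A sketch in which the test function, noise level, and choice of N are assumptions:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

rng = np.random.default_rng(3)
f = lambda x: np.exp(x) * np.sin(5 * x)
nodes = np.cos(np.pi * (np.arange(200) + 0.5) / 200)  # Chebyshev points
samples = f(nodes) + 1e-4 * rng.normal(size=200)      # perturbed data

N = 20                                 # truncation order = regularization parameter
coeffs = C.chebfit(nodes, samples, N)  # degree-N fit: keeps N + 1 coefficients
dcoeffs = C.chebder(coeffs)            # differentiate the finite sum exactly
x = np.linspace(-1, 1, 5)
print(C.chebval(x, dcoeffs))           # stable approximation of f'(x)
```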
{"title":"On optimal recovery and information complexity in numerical differentiation and summation","authors":"Y.V. Semenova , S.G. Solodky","doi":"10.1016/j.jco.2025.101975","DOIUrl":"10.1016/j.jco.2025.101975","url":null,"abstract":"<div><div>In this paper, we study the problems of numerical differentiation and summation of univariate functions from the weighted Wiener classes. To solve these problems, we propose an approach based on the truncation method. The essence of this method is to replace the infinite Fourier series with a finite sum. It is only necessary to properly select the order of this sum, which plays the role of a regularization parameter here. The results show that the proposed approach not only ensures a stability of approximations and does not require cumbersome computational procedures, but also constructs algorithms that achieve the optimal order of accuracy using the minimal amount of perturbed values of Fourier-Chebyshev coefficients. Moreover, we establish under what conditions the summation problem is well-posed on the considered function classes.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"92 ","pages":"Article 101975"},"PeriodicalIF":1.8,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144570627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lower bounds on the minimal dispersion of point sets via cover-free families
M. Trödler, J. Volec, J. Vybíral
Pub Date: 2025-06-30 | DOI: 10.1016/j.jco.2025.101974 | Journal of Complexity 91, Article 101974
We elaborate on the intimate connection between the largest volume of an empty axis-parallel box amid a set of n points from $[0,1]^d$ and cover-free families from extremal set theory. This connection was discovered in a recent paper of the authors. In this work, we apply a very recent result of Michel and Scott to obtain a whole range of new lower bounds on the number of points needed so that the largest volume of such a box is bounded by a given ε. Surprisingly, it turns out that for each of the new bounds there is a choice of the parameters d and ε such that the bound outperforms the others.
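For intuition, the quantity being bounded, namely the largest volume of an empty axis-parallel box, can be computed by brute force in low dimension, since a maximal empty box has each side supported by a point coordinate or the boundary. A small-scale sketch in d = 2:

```python
import itertools
import numpy as np

def dispersion_2d(points):
    # Largest empty open axis-parallel box in [0,1]^2: candidate box
    # sides are taken from the point coordinates together with 0 and 1.
    xs = sorted(set(points[:, 0]) | {0.0, 1.0})
    ys = sorted(set(points[:, 1]) | {0.0, 1.0})
    best = 0.0
    for x0, x1 in itertools.combinations(xs, 2):
        for y0, y1 in itertools.combinations(ys, 2):
            inside = ((points[:, 0] > x0) & (points[:, 0] < x1) &
                      (points[:, 1] > y0) & (points[:, 1] < y1))
            if not inside.any():
                best = max(best, (x1 - x0) * (y1 - y0))
    return best

rng = np.random.default_rng(4)
pts = rng.uniform(size=(30, 2))
print(dispersion_2d(pts))  # shrinks as n grows; lower bounds limit how fast
```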
{"title":"Lower bounds on the minimal dispersion of point sets via cover-free families","authors":"M. Trödler , J. Volec , J. Vybíral","doi":"10.1016/j.jco.2025.101974","DOIUrl":"10.1016/j.jco.2025.101974","url":null,"abstract":"<div><div>We elaborate on the intimate connection between the largest volume of an empty axis-parallel box in a set of <em>n</em> points from <span><math><msup><mrow><mo>[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>]</mo></mrow><mrow><mi>d</mi></mrow></msup></math></span> and cover-free families from the extremal set theory. This connection was discovered in a recent paper of the authors. In this work, we apply a very recent result of Michel and Scott to obtain a whole range of new lower bounds on the number of points needed so that the largest volume of such a box is bounded by a given <em>ε</em>. Surprisingly, it turns out that for each of the new bounds, there is a choice of the parameters <em>d</em> and <em>ε</em> such that the bound outperforms the others.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"91 ","pages":"Article 101974"},"PeriodicalIF":1.8,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144522696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Upper and lower error bounds for a compact fourth-order finite-difference scheme for the wave equation with nonsmooth data
A. Zlotnik
Pub Date: 2025-06-30 | DOI: 10.1016/j.jco.2025.101973 | Journal of Complexity 91, Article 101973
A compact three-level fourth-order finite-difference scheme for solving the 1d wave equation is studied. New error bounds of the fractional order $O(h^{4(\lambda-1)/5})$ are proved in the mesh energy norm in terms of data, for two initial functions from the Sobolev and Nikolskii spaces with the smoothness orders $\lambda$ and $\lambda-1$ and the free term with a dominated mixed smoothness of order $\lambda-1$, for $1\leqslant\lambda\leqslant 6$. The corresponding lower error bounds are proved as well to ensure the sharpness in order of the above error bounds with respect to each of the initial functions and the free term for any $\lambda$. Moreover, they demonstrate that the upper error bounds cannot be improved if the Lebesgue summability indices in the error norm are weakened down to 1 both in $x$ and $t$ and simultaneously the summability indices in the norms of the data are strengthened up to $\infty$ both in $x$ and $t$. Numerical experiments confirming the sharpness of the mentioned orders for half-integer $\lambda$ and piecewise polynomial data have already been carried out previously.
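For context, here is the standard three-level leapfrog scheme (second-order accurate) for the 1d wave equation; the compact fourth-order scheme analyzed above refines this stencil and is not reproduced here. Initial data and parameters are illustrative:

```python
import numpy as np

# Leapfrog for u_tt = c^2 u_xx on (0,1), zero boundary values, with the
# exact solution u = sin(pi x) cos(pi c t) for this initial condition.
c, n, T = 1.0, 200, 0.5
h = 1.0 / n
tau = 0.5 * h / c           # step satisfying the CFL condition
x = np.linspace(0, 1, n + 1)
u_prev = np.sin(np.pi * x)  # u(x, 0)
u_curr = u_prev.copy()      # zero initial velocity (first-order start)
r2 = (c * tau / h) ** 2

for _ in range(int(T / tau)):
    u_next = np.zeros_like(u_curr)
    u_next[1:-1] = (2 * u_curr[1:-1] - u_prev[1:-1]
                    + r2 * (u_curr[2:] - 2 * u_curr[1:-1] + u_curr[:-2]))
    u_prev, u_curr = u_curr, u_next
print(u_curr[n // 2])       # exact value is cos(pi/2) = 0
```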
{"title":"Upper and lower error bounds for a compact fourth-order finite-difference scheme for the wave equation with nonsmooth data","authors":"A. Zlotnik","doi":"10.1016/j.jco.2025.101973","DOIUrl":"10.1016/j.jco.2025.101973","url":null,"abstract":"<div><div>A compact three-level fourth-order finite-difference scheme for solving the 1d wave equation is studied. New error bounds of the fractional order <span><math><mi>O</mi><mo>(</mo><msup><mrow><mi>h</mi></mrow><mrow><mn>4</mn><mo>(</mo><mi>λ</mi><mo>−</mo><mn>1</mn><mo>)</mo><mo>/</mo><mn>5</mn></mrow></msup><mo>)</mo></math></span> are proved in the mesh energy norm in terms of data, for two initial functions from the Sobolev and Nikolskii spaces with the smoothness orders <em>λ</em> and <span><math><mi>λ</mi><mo>−</mo><mn>1</mn></math></span> and the free term with a dominated mixed smoothness of order <span><math><mi>λ</mi><mo>−</mo><mn>1</mn></math></span>, for <span><math><mn>1</mn><mo>⩽</mo><mi>λ</mi><mo>⩽</mo><mn>6</mn></math></span>. The corresponding lower error bounds are proved as well to ensure the sharpness in order of the above error bounds with respect to each of the initial functions and the free term for any <em>λ</em>. Moreover, they demonstrate that the upper error bounds cannot be improved if the Lebesgue summability indices in the error norm are weakened down to 1 both in <em>x</em> and <em>t</em> and simultaneously the summability indices in the norms of data are strengthened up to ∞ both in <em>x</em> and <em>t</em>. Numerical experiments confirming the sharpness of the mentioned orders for half-integer <em>λ</em> and piecewise polynomial data have already been carried out previously.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"91 ","pages":"Article 101973"},"PeriodicalIF":1.8,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144549672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Convergence analysis of a regularized iterative scheme for solving nonlinear problems
M.P. Rajan, Niloopher Salam
Pub Date: 2025-06-23 | DOI: 10.1016/j.jco.2025.101972 | Journal of Complexity 91, Article 101972
Nonlinear inverse and ill-posed problems occur in many practical applications, and regularization techniques are employed to obtain stable approximate solutions to them. Although many schemes are available in the literature, iterative regularization techniques are the most commonly used approaches. One such important method is the Levenberg-Marquardt scheme. However, the scheme involves computing the Fréchet derivative at every iterate, which makes it tedious, and the restrictive assumptions imposed on the derivative are often difficult to verify in practical scenarios. In this paper, we propose a simplified Levenberg-Marquardt scheme that has two benefits: firstly, the Fréchet derivative needs to be computed only once, at the initial point; secondly, the convergence and optimal convergence rate of the method are established under weaker assumptions than for the standard method. We also provide numerical examples to illustrate the theory, and the results clearly demonstrate the advantages of the proposed scheme over the standard method.
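The scheme's key feature, freezing the Fréchet derivative at the initial point, can be sketched directly; the damping schedule and the toy problem below are assumptions, not the paper's setup:

```python
import numpy as np

def simplified_lm(F, J0, x0, y, alpha=1.0, q=0.7, iters=30):
    # Levenberg-Marquardt with the Jacobian frozen at x0:
    # x_{k+1} = x_k + (J0^T J0 + a_k I)^{-1} J0^T (y - F(x_k)).
    x, n = x0.copy(), x0.size
    for _ in range(iters):
        step = np.linalg.solve(J0.T @ J0 + alpha * np.eye(n),
                               J0.T @ (y - F(x)))
        x, alpha = x + step, alpha * q  # geometrically decaying damping
    return x

# Toy problem F(x) = (x1 + x2^2, x1*x2) with known solution (1, 2).
F = lambda x: np.array([x[0] + x[1] ** 2, x[0] * x[1]])
y = F(np.array([1.0, 2.0]))
x0 = np.array([0.8, 1.8])
J0 = np.array([[1.0, 2 * x0[1]],
               [x0[1], x0[0]]])     # Fréchet derivative computed once, at x0
print(simplified_lm(F, J0, x0, y))  # converges to approximately [1. 2.]
```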
{"title":"Convergence analysis of a regularized iterative scheme for solving nonlinear problems","authors":"M.P. Rajan, Niloopher Salam","doi":"10.1016/j.jco.2025.101972","DOIUrl":"10.1016/j.jco.2025.101972","url":null,"abstract":"<div><div>Nonlinear inverse and ill-posed problems occur in many practical applications and the regularization techniques are employed to get a stable approximate solution for the same. Although many schemes are available in literature, iterative regularization techniques are the most commonly used approaches. One such important method is the Levenberg-Marquardt scheme. However, the scheme involves computation of the Fréchet derivative at every iterate which makes it tedious and the restrictive assumptions on it often difficult to verify for practical scenarios. In this paper, we propose a simplified Levenberg-Marquardt scheme that has two benefits. Firstly, computation of the Fréchet derivative is required only once at the initial point and secondly, the convergence and optimal convergence rate of the method is established with weaker assumptions as compared to the standard method. We also provide numerical examples to illustrate the theory and, results clearly illustrate the advantages of the proposed scheme over the standard method.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"91 ","pages":"Article 101972"},"PeriodicalIF":1.8,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144490696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rademacher learning rates for iterated random functions
Nikola Sandrić
Pub Date: 2025-06-19 | DOI: 10.1016/j.jco.2025.101971 | Journal of Complexity 91, Article 101971
Most supervised learning methods assume that the training data form an i.i.d. sample. However, real-world problems often exhibit temporal dependence and strong correlations between the marginals of the data-generating process, rendering the i.i.d. assumption unrealistic. Such cases naturally involve time-series processes and Markov chains. The learning rates typically obtained in these settings remain independent of the data distribution, potentially leading to restrictive hypothesis classes and suboptimal sample complexities. We consider training data generated by an iterated random function that need not be irreducible or aperiodic. Assuming that the governing function is contractive in its first argument and subject to certain regularity conditions on the hypothesis class, we first establish uniform convergence for the sample error. We then prove the learnability of approximate empirical risk minimization and derive a bound on its learning rate. Both bounds depend explicitly on the data distribution through the Rademacher complexities of the hypothesis class, thereby better capturing properties of the data-generating distribution.
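A minimal sketch of the data model: an iterated random function X_{t+1} = g(X_t, ε_t) that is contractive in its first argument, followed by empirical risk minimization over a one-parameter class; all concrete settings below are assumptions:

```python
import numpy as np

# Iterated random function g(x, e) = 0.5*x + e, contractive since
# |dg/dx| = 0.5 < 1; no irreducibility or aperiodicity is checked.
rng = np.random.default_rng(5)
T = 1000
X = np.zeros(T + 1)
for t in range(T):
    X[t + 1] = 0.5 * X[t] + rng.normal(scale=0.3)

# The training pairs (X_t, X_{t+1}) are dependent, not i.i.d.; empirical
# risk minimization over the class {x -> a*x} is still well defined.
inputs, targets = X[:-1], X[1:]
a_hat = (inputs @ targets) / (inputs @ inputs)  # least-squares ERM
print(a_hat)  # close to the true contraction coefficient 0.5
```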
{"title":"Rademacher learning rates for iterated random functions","authors":"Nikola Sandrić","doi":"10.1016/j.jco.2025.101971","DOIUrl":"10.1016/j.jco.2025.101971","url":null,"abstract":"<div><div>Most supervised learning methods assume training data is drawn from an i.i.d. sample. However, real-world problems often exhibit temporal dependence and strong correlations between marginals of the data-generating process, rendering the i.i.d. assumption unrealistic. Such cases naturally involve time-series processes and Markov chains. The learning rates typically obtained in these settings remain independent of the data distribution, potentially leading to restrictive hypothesis classes and suboptimal sample complexities. We consider training data generated by an iterated random function that need not be irreducible or aperiodic. Assuming the governing function is contractive in its first argument and subject to certain regularity conditions on the hypothesis class, we first establish uniform convergence for the sample error. We then prove learnability of approximate empirical risk minimization and derive its learning rate bound. Both bounds depend explicitly on the data distribution through the Rademacher complexities of the hypothesis class, thereby better capturing properties of the data-generating distribution.</div></div>","PeriodicalId":50227,"journal":{"name":"Journal of Complexity","volume":"91 ","pages":"Article 101971"},"PeriodicalIF":1.8,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144321580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}