Pattern recovery by SLOPE
Pub Date: 2025-09-08 | DOI: 10.1016/j.acha.2025.101810
Małgorzata Bogdan, Xavier Dupuis, Piotr Graczyk, Bartosz Kołodziejek, Tomasz Skalski, Patrick Tardivel, Maciej Wilczyński
SLOPE is a popular method for dimensionality reduction in high-dimensional regression. Its estimated coefficients can be zero, yielding sparsity, or equal in absolute value, yielding clustering. As a result, SLOPE can eliminate irrelevant predictors and identify groups of predictors that have the same influence on the response. The concept of the SLOPE pattern allows us to formalize and study its sparsity and clustering properties. In particular, the SLOPE pattern of a coefficient vector captures the signs of its components (positive, negative, or zero), the clusters (groups of coefficients with the same absolute value), and the ranking of those clusters. This is the first paper to thoroughly investigate the consistency of the SLOPE pattern. We establish necessary and sufficient conditions for SLOPE pattern recovery, which in turn enable the derivation of an irrepresentability condition for SLOPE given a fixed design matrix X. These results lay the groundwork for a comprehensive asymptotic analysis of SLOPE pattern consistency.
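For concreteness, a minimal sketch (our own illustration, not code from the paper) of extracting the SLOPE pattern of a coefficient vector — the signs, the clusters of equal absolute value, and the ranking of those clusters — assuming magnitudes tie exactly:

```python
import numpy as np

def slope_pattern(b):
    # sign(b_i) times the rank of |b_i| among the distinct nonzero
    # absolute values; zero coefficients get pattern value 0
    b = np.asarray(b, dtype=float)
    levels = np.unique(np.abs(b[b != 0]))            # sorted distinct cluster levels
    rank = {v: k + 1 for k, v in enumerate(levels)}  # smallest cluster -> rank 1
    return np.array([int(np.sign(x)) * rank.get(abs(x), 0) for x in b])

print(slope_pattern([3.0, -1.5, 0.0, 1.5, -3.0]))    # -> [ 2 -1  0  1 -2]
```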
{"title":"Pattern recovery by SLOPE","authors":"Małgorzata Bogdan , Xavier Dupuis , Piotr Graczyk , Bartosz Kołodziejek , Tomasz Skalski , Patrick Tardivel , Maciej Wilczyński","doi":"10.1016/j.acha.2025.101810","DOIUrl":"10.1016/j.acha.2025.101810","url":null,"abstract":"<div><div>SLOPE is a popular method for dimensionality reduction in high-dimensional regression. Its estimated coefficients can be zero, yielding sparsity, or equal in absolute value, yielding clustering. As a result, SLOPE can eliminate irrelevant predictors and identify groups of predictors that have the same influence on the response. The concept of the SLOPE pattern allows us to formalize and study its sparsity and clustering properties. In particular, the SLOPE pattern of a coefficient vector captures the signs of its components (positive, negative, or zero), the clusters (groups of coefficients with the same absolute value), and the ranking of those clusters. This is the first paper to thoroughly investigate the consistency of the SLOPE pattern. We establish necessary and sufficient conditions for SLOPE pattern recovery, which in turn enable the derivation of an irrepresentability condition for SLOPE given a fixed design matrix <span><math><mi>X</mi></math></span>. These results lay the groundwork for a comprehensive asymptotic analysis of SLOPE pattern consistency.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"80 ","pages":"Article 101810"},"PeriodicalIF":3.2,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145096438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimal lower Lipschitz bounds for ReLU layers, saturation, and phase retrieval
Pub Date: 2025-08-28 | DOI: 10.1016/j.acha.2025.101801
Daniel Freeman, Daniel Haider
The injectivity of ReLU layers in neural networks, the recovery of vectors from clipped or saturated measurements, and (real) phase retrieval in $\mathbb{R}^n$ allow for a similar problem formulation and characterization using frame theory. In this paper, we revisit all three problems from a unified perspective and derive lower Lipschitz bounds for ReLU layers and clipping which are analogous to the previously known result for phase retrieval and are optimal up to a constant factor.
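To illustrate the two nonlinear measurement models, a small sketch (our own, not the authors' code) of a ReLU layer and a clipping/saturation operator acting on frame coefficients; the random frame `W` and the clipping level are arbitrary choices for demonstration:

```python
import numpy as np

def relu_layer(W, b, x):
    # frame analysis followed by the ReLU nonlinearity
    return np.maximum(W @ x + b, 0.0)

def clip(y, c):
    # saturated measurements: values outside [-c, c] are clipped
    return np.clip(y, -c, c)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 3))     # 8 random measurement vectors in R^3
x = rng.standard_normal(3)
print(relu_layer(W, np.zeros(8), x))
print(clip(W @ x, 1.0))
```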
{"title":"Optimal lower Lipschitz bounds for ReLU layers, saturation, and phase retrieval","authors":"Daniel Freeman , Daniel Haider","doi":"10.1016/j.acha.2025.101801","DOIUrl":"10.1016/j.acha.2025.101801","url":null,"abstract":"<div><div>The injectivity of ReLU layers in neural networks, the recovery of vectors from clipped or saturated measurements, and (real) phase retrieval in <span><math><msup><mrow><mi>R</mi></mrow><mrow><mi>n</mi></mrow></msup></math></span> allow for a similar problem formulation and characterization using frame theory. In this paper, we revisit all three problems with a unified perspective and derive lower Lipschitz bounds for ReLU layers and clipping which are analogous to the previously known result for phase retrieval and are optimal up to a constant factor.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"80 ","pages":"Article 101801"},"PeriodicalIF":3.2,"publicationDate":"2025-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144921299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sparse free deconvolution under unknown noise level via eigenmatrix
Pub Date: 2025-08-14 | DOI: 10.1016/j.acha.2025.101802
Lexing Ying
This note considers the spectral estimation problem for sparse spectral measures under an unknown noise level. The main technical tool is the eigenmatrix method for solving unstructured sparse recovery problems. When the noise level is known, free deconvolution reduces the problem to an unstructured sparse recovery problem to which the eigenmatrix method can be applied. To determine the unknown noise level, we propose an optimization problem based on the singular values of an intermediate matrix of the eigenmatrix method. Numerical results are provided for both additive and multiplicative free deconvolution.
{"title":"Sparse free deconvolution under unknown noise level via eigenmatrix","authors":"Lexing Ying","doi":"10.1016/j.acha.2025.101802","DOIUrl":"10.1016/j.acha.2025.101802","url":null,"abstract":"<div><div>This note considers the spectral estimation problems of sparse spectral measures under unknown noise levels. The main technical tool is the eigenmatrix method for solving unstructured sparse recovery problems. When the noise level is determined, the free deconvolution reduces the problem to an unstructured sparse recovery problem to which the eigenmatrix method can be applied. To determine the unknown noise level, we propose an optimization problem based on the singular values of an intermediate matrix of the eigenmatrix method. Numerical results are provided for both the additive and multiplicative free deconvolutions.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"79 ","pages":"Article 101802"},"PeriodicalIF":3.2,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144865562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sharp error estimates for target measure diffusion maps with applications to the committor problem
Pub Date: 2025-08-14 | DOI: 10.1016/j.acha.2025.101803
Shashank Sule, Luke Evans, Maria Cameron
We obtain asymptotically sharp error estimates for the consistency error of the Target Measure Diffusion map (TMDmap) (Banisch et al. 2020), a variant of diffusion maps featuring importance sampling and hence allowing input data drawn from an arbitrary density. The derived error estimates include the bias error and the variance error. The resulting convergence rates are consistent with the approximation theory of graph Laplacians. The key novelty of our results lies in the explicit quantification of all the prefactors on the leading-order terms. We also prove an error estimate for solutions of Dirichlet BVPs obtained using TMDmap, showing that the solution error is controlled by the consistency error. We use these results to study an important application of TMDmap in the analysis of rare events in systems governed by overdamped Langevin dynamics using the framework of transition path theory (TPT). The cornerstone ingredient of TPT is the solution of the committor problem, a boundary value problem for the backward Kolmogorov PDE. Remarkably, we find that the TMDmap algorithm is particularly well suited as a meshless solver for the committor problem due to the cancellation of several error terms in the prefactor formula. Furthermore, significant improvements in bias and variance errors occur when using a quasi-uniform sampling density. Our numerical experiments show that these improvements in accuracy are realizable in practice when using δ-nets as spatially uniform inputs to the TMDmap algorithm.
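For orientation, a schematic of the TMDmap construction as we read it from Banisch et al. (2020), the reference cited above; the bandwidth convention and the normalization details here are our assumptions, not code from either paper. `pi_vals` holds the (possibly unnormalized) target density evaluated at the sample points:

```python
import numpy as np
from scipy.spatial.distance import cdist

def tmdmap_generator(points, pi_vals, eps):
    # Gaussian kernel on the point cloud (bandwidth convention: 4*eps)
    K = np.exp(-cdist(points, points, 'sqeuclidean') / (4 * eps))
    q = K.sum(axis=1)                         # kernel density estimate at each point
    K_pi = K * (np.sqrt(pi_vals) / q)         # right-normalize column j by sqrt(pi_j)/q_j
    P = K_pi / K_pi.sum(axis=1, keepdims=True)
    return (P - np.eye(len(points))) / eps    # discrete generator of the target dynamics
```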
{"title":"Sharp error estimates for target measure diffusion maps with applications to the committor problem","authors":"Shashank Sule , Luke Evans , Maria Cameron","doi":"10.1016/j.acha.2025.101803","DOIUrl":"10.1016/j.acha.2025.101803","url":null,"abstract":"<div><div>We obtain asymptotically sharp error estimates for the consistency error of the Target Measure Diffusion map (TMDmap) (Banisch et al. 2020), a variant of diffusion maps featuring importance sampling and hence allowing input data drawn from an arbitrary density. The derived error estimates include the bias error and the variance error. The resulting convergence rates are consistent with the approximation theory of graph Laplacians. The key novelty of our results lies in the explicit quantification of all the prefactors on leading-order terms. We also prove an error estimate for solutions of Dirichlet BVPs obtained using TMDmap, showing that the solution error is controlled by consistency error. We use these results to study an important application of TMDmap in the analysis of rare events in systems governed by overdamped Langevin dynamics using the framework of transition path theory (TPT). The cornerstone ingredient of TPT is the solution of the committor problem, a boundary value problem for the backward Kolmogorov PDE. Remarkably, we find that the TMDmap algorithm is particularly suited as a meshless solver to the committor problem due to the cancellation of several error terms in the prefactor formula. Furthermore, significant improvements in bias and variance errors occur when using a quasi-uniform sampling density. Our numerical experiments show that these improvements in accuracy are realizable in practice when using <em>δ</em>-nets as spatially uniform inputs to the TMDmap algorithm.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"79 ","pages":"Article 101803"},"PeriodicalIF":3.2,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144885793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large data limit of the MBO scheme for data clustering: Γ-convergence of the thresholding energies
Pub Date: 2025-08-14 | DOI: 10.1016/j.acha.2025.101800
Tim Laux, Jona Lelmi
In this work we present the first rigorous analysis of the MBO scheme for data clustering in the large data limit. Each iteration of the scheme corresponds to one step of implicit gradient descent for the thresholding energy on the similarity graph of some dataset. For a subset of the nodes of the graph, the thresholding energy at time h measures the amount of heat transferred from the subset to its complement at time h, rescaled by a factor $\sqrt{h}$. It is then natural to think that outcomes of the MBO scheme are (local) minimizers of this energy. We prove that the algorithm is consistent, in the sense that these (local) minimizers converge to (local) minimizers of a suitably weighted optimal partition problem.
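A minimal sketch (our illustration, not the authors') of one MBO iteration on a similarity graph — graph heat flow for time h followed by thresholding; the dense matrix exponential is a simplification suitable only for small graphs:

```python
import numpy as np
from scipy.linalg import expm

def mbo_step(L, U, h):
    # L: graph Laplacian (n x n); U: (n, K) one-hot cluster indicators
    V = expm(-h * L) @ U                # diffusion: graph heat flow for time h
    labels = V.argmax(axis=1)           # thresholding: keep the dominant cluster
    return np.eye(U.shape[1])[labels]   # back to one-hot indicators
```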
{"title":"Large data limit of the MBO scheme for data clustering: Γ-convergence of the thresholding energies","authors":"Tim Laux , Jona Lelmi","doi":"10.1016/j.acha.2025.101800","DOIUrl":"10.1016/j.acha.2025.101800","url":null,"abstract":"<div><div>In this work we present the first rigorous analysis of the MBO scheme for data clustering in the large data limit. Each iteration of the scheme corresponds to one step of implicit gradient descent for the thresholding energy on the similarity graph of some dataset. For a subset of the nodes of the graph, the thresholding energy at time <em>h</em> measures the amount of heat transferred from the subset to its complement at time <em>h</em>, rescaled by a factor <span><math><msqrt><mrow><mi>h</mi></mrow></msqrt></math></span>. It is then natural to think that outcomes of the MBO scheme are (local) minimizers of this energy. We prove that the algorithm is consistent, in the sense that these (local) minimizers converge to (local) minimizers of a suitably weighted optimal partition problem.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"79 ","pages":"Article 101800"},"PeriodicalIF":3.2,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144904067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Wigner distribution of Gaussian tempered generalized stochastic processes
Pub Date: 2025-08-13 | DOI: 10.1016/j.acha.2025.101799
Patrik Wahlberg
We define the Wigner distribution of a tempered generalized stochastic process that is complex-valued symmetric Gaussian. This gives a time-frequency generalized stochastic process defined on the phase space. We study its covariance and our main result is a formula for the Weyl symbol of the covariance operator, expressed in terms of the Weyl symbol of the covariance operator of the original generalized stochastic process.
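For orientation, the Wigner distribution of a deterministic tempered function f on $\mathbb{R}^d$, which the paper extends to tempered generalized stochastic processes, is commonly defined (normalization conventions vary) as

$$
Wf(x,\xi) \;=\; \int_{\mathbb{R}^d} f\!\Big(x+\frac{t}{2}\Big)\,\overline{f\!\Big(x-\frac{t}{2}\Big)}\, e^{-2\pi i\, t\cdot \xi}\, dt .
$$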
{"title":"The Wigner distribution of Gaussian tempered generalized stochastic processes","authors":"Patrik Wahlberg","doi":"10.1016/j.acha.2025.101799","DOIUrl":"10.1016/j.acha.2025.101799","url":null,"abstract":"<div><div>We define the Wigner distribution of a tempered generalized stochastic process that is complex-valued symmetric Gaussian. This gives a time-frequency generalized stochastic process defined on the phase space. We study its covariance and our main result is a formula for the Weyl symbol of the covariance operator, expressed in terms of the Weyl symbol of the covariance operator of the original generalized stochastic process.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"79 ","pages":"Article 101799"},"PeriodicalIF":3.2,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144861221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Permutation-invariant representations with applications to graph deep learning
Pub Date: 2025-08-06 | DOI: 10.1016/j.acha.2025.101798
Radu Balan, Naveed Haghani, Maneesh Singh
This paper presents primarily two Euclidean embeddings of the quotient space generated by matrices that are identified modulo arbitrary row permutations. The original application is in deep learning on graphs, where the learning task is invariant to node relabeling. Two embedding schemes are introduced, one based on sorting and the other based on algebras of multivariate polynomials. While both embeddings exhibit a computational complexity exponential in the problem size, the sorting-based embedding is globally bi-Lipschitz and admits a low-dimensional target space. Additionally, an almost-everywhere injective scheme can be implemented with minimal redundancy and low computational cost. In turn, this proves that almost any classifier can be implemented with an arbitrarily small loss of performance. Numerical experiments are carried out on two datasets, a chemical compound dataset (QM9) and a proteins dataset (PROTEINS_FULL).
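A toy version (our own sketch, not the paper's construction) of a sorting-based permutation-invariant map: a row permutation of X permutes every column of XA identically, so sorting each column independently removes it. The random "key" matrix A is our assumption for illustration:

```python
import numpy as np

def sorted_embedding(X, A):
    # sort each column of X @ A independently; the result is invariant
    # to row permutations of X
    return np.sort(X @ A, axis=0).ravel()

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))
A = rng.standard_normal((3, 7))          # random key matrix (our choice)
P = np.eye(5)[rng.permutation(5)]        # a row permutation
assert np.allclose(sorted_embedding(X, A), sorted_embedding(P @ X, A))
```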
{"title":"Permutation-invariant representations with applications to graph deep learning","authors":"Radu Balan , Naveed Haghani , Maneesh Singh","doi":"10.1016/j.acha.2025.101798","DOIUrl":"10.1016/j.acha.2025.101798","url":null,"abstract":"<div><div>This paper presents primarily two Euclidean embeddings of the quotient space generated by matrices that are identified modulo arbitrary row permutations. The original application is in deep learning on graphs where the learning task is invariant to node relabeling. Two embedding schemes are introduced, one based on sorting and the other based on algebras of multivariate polynomials. While both embeddings exhibit a computational complexity exponential in problem size, the sorting based embedding is globally bi-Lipschitz and admits a low dimensional target space. Additionally, an almost everywhere injective scheme can be implemented with minimal redundancy and low computational cost. In turn, this proves that almost any classifier can be implemented with an arbitrary small loss of performance. Numerical experiments are carried out on two datasets, a chemical compound dataset (<span>QM9</span>) and a proteins dataset (<span>PROTEINS_FULL</span>).</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"79 ","pages":"Article 101798"},"PeriodicalIF":3.2,"publicationDate":"2025-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144809570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Minibatch and local SGD: Algorithmic stability and linear speedup in generalization
Pub Date: 2025-07-16 | DOI: 10.1016/j.acha.2025.101795
Yunwen Lei, Tao Sun, Mingrui Liu
The increasing scale of data propels the popularity of leveraging parallelism to speed up optimization. Minibatch stochastic gradient descent (minibatch SGD) and local SGD are two popular methods for parallel optimization. Existing theoretical studies show a linear speedup of these methods with respect to the number of machines, which, however, is measured by optimization errors in a multi-pass setting. By comparison, the stability and generalization of these methods are much less studied. In this paper, we study the stability and generalization of minibatch and local SGD to understand their learnability by introducing an expectation-variance decomposition. We incorporate training errors into the stability analysis, which shows how small training errors help generalization for overparameterized models. We show that minibatch and local SGD achieve a linear speedup to attain the optimal risk bounds.
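To fix ideas, a toy contrast (our own illustration) between the two parallel schemes: minibatch SGD averages worker gradients at every step, while local SGD lets each worker descend on its own shard and averages the iterates only every few steps. Here `grad` is a user-supplied stochastic gradient oracle:

```python
import numpy as np

def minibatch_sgd(grad, w, shards, lr, steps):
    # one synchronized model; gradients averaged across workers every step
    for _ in range(steps):
        w = w - lr * np.mean([grad(w, d) for d in shards], axis=0)
    return w

def local_sgd(grad, w, shards, lr, steps, sync=10):
    # each worker runs independently; iterates averaged every `sync` steps
    ws = [w.copy() for _ in shards]
    for t in range(steps):
        ws = [wk - lr * grad(wk, d) for wk, d in zip(ws, shards)]
        if (t + 1) % sync == 0:
            ws = [np.mean(ws, axis=0)] * len(ws)
    return np.mean(ws, axis=0)
```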
{"title":"Minibatch and local SGD: Algorithmic stability and linear speedup in generalization","authors":"Yunwen Lei , Tao Sun , Mingrui Liu","doi":"10.1016/j.acha.2025.101795","DOIUrl":"10.1016/j.acha.2025.101795","url":null,"abstract":"<div><div>The increasing scale of data propels the popularity of leveraging parallelism to speed up the optimization. Minibatch stochastic gradient descent (minibatch SGD) and local SGD are two popular methods for parallel optimization. The existing theoretical studies show a linear speedup of these methods with respect to the number of machines, which, however, is measured by optimization errors in a multi-pass setting. As a comparison, the stability and generalization of these methods are much less studied. In this paper, we study the stability and generalization analysis of minibatch and local SGD to understand their learnability by introducing an expectation-variance decomposition. We incorporate training errors into the stability analysis, which shows how small training errors help generalization for overparameterized models. We show minibatch and local SGD achieve a linear speedup to attain the optimal risk bounds.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"79 ","pages":"Article 101795"},"PeriodicalIF":2.6,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144653251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-dimensional unlimited sampling and robust reconstruction
Pub Date: 2025-07-16 | DOI: 10.1016/j.acha.2025.101796
Dorian Florescu, Ayush Bhandari
In this paper we introduce a new sampling and reconstruction approach for multi-dimensional analog signals. Building on the Unlimited Sensing Framework (USF), we present a new folded sampling operator called the multi-dimensional modulo-hysteresis that is also backwards compatible with the existing one-dimensional modulo operator. Unlike previous approaches, the proposed model is specifically tailored to multi-dimensional signals. In particular, the model uses certain redundancy in dimensions 2 and above, which is exploited for robust input recovery. We prove that the new operator is well-defined and that its outputs have a bounded dynamic range. For the noiseless case, we derive a theoretically guaranteed input reconstruction approach. When the input is corrupted by Gaussian noise, we exploit redundancy in higher dimensions to provide a bound on the error probability and show that it drops to 0 for sufficiently high sampling rates, leading to new theoretical guarantees for the noisy case. Our numerical examples corroborate the theoretical results and show that the proposed approach can handle a significantly larger amount of noise than USF.
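For orientation, the classical one-dimensional centered modulo fold from the USF literature, with which the paper's multi-dimensional modulo-hysteresis operator is backwards compatible; this is the standard fold only, not the paper's new operator:

```python
import numpy as np

def modulo(x, lam):
    # centered fold: maps any real value into [-lam, lam)
    return np.mod(x + lam, 2 * lam) - lam

t = np.linspace(0, 1, 1000)
x = 3.0 * np.sin(2 * np.pi * 5 * t)   # amplitude exceeds the modulo range
y = modulo(x, 1.0)                    # folded (bounded) samples
assert y.min() >= -1.0 and y.max() < 1.0
```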
{"title":"Multi-dimensional unlimited sampling and robust reconstruction","authors":"Dorian Florescu, Ayush Bhandari","doi":"10.1016/j.acha.2025.101796","DOIUrl":"10.1016/j.acha.2025.101796","url":null,"abstract":"<div><div>In this paper we introduce a new sampling and reconstruction approach for multi-dimensional analog signals. Building on top of the Unlimited Sensing Framework (USF), we present a new folded sampling operator called the multi-dimensional modulo-hysteresis that is also backwards compatible with the existing one-dimensional modulo operator. Unlike previous approaches, the proposed model is specifically tailored to multi-dimensional signals. In particular, the model uses certain redundancy in dimensions 2 and above, which is exploited for input recovery with robustness. We prove that the new operator is well-defined and its outputs have a bounded dynamic range. For the noiseless case, we derive a theoretically guaranteed input reconstruction approach. When the input is corrupted by Gaussian noise, we exploit redundancy in higher dimensions to provide a bound on the error probability and show this drops to 0 for high enough sampling rates leading to new theoretical guarantees for the noisy case. Our numerical examples corroborate the theoretical results and show that the proposed approach can handle a significantly larger amount of noise compared to USF.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"79 ","pages":"Article 101796"},"PeriodicalIF":2.6,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144665025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks
Pub Date: 2025-07-16 | DOI: 10.1016/j.acha.2025.101797
Yunfei Yang
This paper studies the problem of how efficiently functions in the Sobolev spaces $W^{s,q}([0,1]^d)$ and Besov spaces $B^{s}_{q,r}([0,1]^d)$ can be approximated by deep ReLU neural networks with width W and depth L, when the error is measured in the $L^p([0,1]^d)$ norm. This problem has been studied by several recent works, which obtained the approximation rate $O((WL)^{-2s/d})$ up to logarithmic factors when $p=q=\infty$, and the rate $O(L^{-2s/d})$ for networks with fixed width when the Sobolev embedding condition $1/q - 1/p < s/d$ holds. We generalize these results by showing that the rate $O((WL)^{-2s/d})$ indeed holds under the Sobolev embedding condition. It is known that this rate is optimal up to logarithmic factors. The key tool in our proof is a novel encoding of sparse vectors by using deep ReLU neural networks with varied width and depth, which may be of independent interest.
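Schematically, and in our own notation ($\Phi(W,L)$ for the class of ReLU networks of width W and depth L, constant C suppressed in the abstract), the main result reads

$$
\inf_{\phi\in\Phi(W,L)} \|f-\phi\|_{L^p([0,1]^d)} \;\le\; C\,\|f\|_{B^{s}_{q,r}([0,1]^d)}\,(WL)^{-2s/d}
\qquad\text{whenever}\quad \frac{1}{q}-\frac{1}{p}<\frac{s}{d}.
$$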
{"title":"On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks","authors":"Yunfei Yang","doi":"10.1016/j.acha.2025.101797","DOIUrl":"10.1016/j.acha.2025.101797","url":null,"abstract":"<div><div>This paper studies the problem of how efficiently functions in the Sobolev spaces <span><math><msup><mrow><mi>W</mi></mrow><mrow><mi>s</mi><mo>,</mo><mi>q</mi></mrow></msup><mo>(</mo><msup><mrow><mo>[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>]</mo></mrow><mrow><mi>d</mi></mrow></msup><mo>)</mo></math></span> and Besov spaces <span><math><msubsup><mrow><mi>B</mi></mrow><mrow><mi>q</mi><mo>,</mo><mi>r</mi></mrow><mrow><mi>s</mi></mrow></msubsup><mo>(</mo><msup><mrow><mo>[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>]</mo></mrow><mrow><mi>d</mi></mrow></msup><mo>)</mo></math></span> can be approximated by deep ReLU neural networks with width <em>W</em> and depth <em>L</em>, when the error is measured in the <span><math><msup><mrow><mi>L</mi></mrow><mrow><mi>p</mi></mrow></msup><mo>(</mo><msup><mrow><mo>[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>]</mo></mrow><mrow><mi>d</mi></mrow></msup><mo>)</mo></math></span> norm. This problem has been studied by several recent works, which obtained the approximation rate <span><math><mi>O</mi><mo>(</mo><msup><mrow><mo>(</mo><mi>W</mi><mi>L</mi><mo>)</mo></mrow><mrow><mo>−</mo><mn>2</mn><mi>s</mi><mo>/</mo><mi>d</mi></mrow></msup><mo>)</mo></math></span> up to logarithmic factors when <span><math><mi>p</mi><mo>=</mo><mi>q</mi><mo>=</mo><mo>∞</mo></math></span>, and the rate <span><math><mi>O</mi><mo>(</mo><msup><mrow><mi>L</mi></mrow><mrow><mo>−</mo><mn>2</mn><mi>s</mi><mo>/</mo><mi>d</mi></mrow></msup><mo>)</mo></math></span> for networks with fixed width when the Sobolev embedding condition <span><math><mn>1</mn><mo>/</mo><mi>q</mi><mo>−</mo><mn>1</mn><mo>/</mo><mi>p</mi><mo><</mo><mi>s</mi><mo>/</mo><mi>d</mi></math></span> holds. We generalize these results by showing that the rate <span><math><mi>O</mi><mo>(</mo><msup><mrow><mo>(</mo><mi>W</mi><mi>L</mi><mo>)</mo></mrow><mrow><mo>−</mo><mn>2</mn><mi>s</mi><mo>/</mo><mi>d</mi></mrow></msup><mo>)</mo></math></span> indeed holds under the Sobolev embedding condition. It is known that this rate is optimal up to logarithmic factors. The key tool in our proof is a novel encoding of sparse vectors by using deep ReLU neural networks with varied width and depth, which may be of independent interest.</div></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"79 ","pages":"Article 101797"},"PeriodicalIF":2.6,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144653252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}