Latest Articles from Applied and Computational Harmonic Analysis
Pattern recovery by SLOPE
IF 3.2 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-09-08 · DOI: 10.1016/j.acha.2025.101810
Małgorzata Bogdan , Xavier Dupuis , Piotr Graczyk , Bartosz Kołodziejek , Tomasz Skalski , Patrick Tardivel , Maciej Wilczyński
SLOPE is a popular method for dimensionality reduction in high-dimensional regression. Its estimated coefficients can be zero, yielding sparsity, or equal in absolute value, yielding clustering. As a result, SLOPE can eliminate irrelevant predictors and identify groups of predictors that have the same influence on the response. The concept of the SLOPE pattern allows us to formalize and study its sparsity and clustering properties. In particular, the SLOPE pattern of a coefficient vector captures the signs of its components (positive, negative, or zero), the clusters (groups of coefficients with the same absolute value), and the ranking of those clusters. This is the first paper to thoroughly investigate the consistency of the SLOPE pattern. We establish necessary and sufficient conditions for SLOPE pattern recovery, which in turn enable the derivation of an irrepresentability condition for SLOPE given a fixed design matrix X. These results lay the groundwork for a comprehensive asymptotic analysis of SLOPE pattern consistency.
Citations: 0
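The SLOPE pattern described in the abstract can be sketched numerically. The helper below is an illustrative reading of the definition, not the authors' code (the function name and the convention that ranks count upward from the smallest nonzero magnitude are assumptions): each coefficient keeps its sign, and its magnitude is replaced by the rank of its absolute value among the distinct nonzero absolute values, so entries with equal absolute value fall into the same cluster.

```python
import numpy as np

def slope_pattern(b):
    """Hypothetical sketch of the SLOPE pattern of a coefficient vector b:
    each entry keeps its sign, and its magnitude is replaced by the rank of
    |b_i| among the distinct nonzero absolute values (rank 0 for zeros)."""
    b = np.asarray(b, dtype=float)
    mags = np.abs(b)
    # distinct nonzero absolute values, ranked from smallest (1) upward
    levels = np.unique(mags[mags > 0])
    ranks = {v: k + 1 for k, v in enumerate(levels)}
    return np.array([int(np.sign(x)) * ranks.get(abs(x), 0) for x in b])

# two clusters: {|b_i| = 3.0} (rank 2) and {|b_i| = 1.5} (rank 1), one zero
print(slope_pattern([3.0, -3.0, 1.5, 0.0, 1.5]))  # signs + cluster ranks
```

The pattern is unchanged under scaling of the cluster magnitudes, which is what makes it a purely combinatorial object to recover.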
Optimal lower Lipschitz bounds for ReLU layers, saturation, and phase retrieval
IF 3.2 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-08-28 · DOI: 10.1016/j.acha.2025.101801
Daniel Freeman , Daniel Haider
The injectivity of ReLU layers in neural networks, the recovery of vectors from clipped or saturated measurements, and (real) phase retrieval in ℝ^n allow for a similar problem formulation and characterization using frame theory. In this paper, we revisit all three problems with a unified perspective and derive lower Lipschitz bounds for ReLU layers and clipping which are analogous to the previously known result for phase retrieval and are optimal up to a constant factor.
Citations: 0
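A toy version of the saturation setting above: frame measurements of a vector are clipped at a level λ, and the vector is recovered from the measurements that did not saturate. This least-squares sketch only illustrates the recovery problem under the assumption that the unsaturated frame vectors still span the space; it is not the paper's contribution, which is the lower Lipschitz bound for such clipping maps.

```python
import numpy as np

lam = 1.0                                   # saturation (clipping) level
F = np.array([[1.0, 0.0], [0.0, 1.0],      # a redundant frame for R^2
              [1.0, 1.0], [1.0, -1.0],
              [2.0, 1.0], [0.5, 0.5]])
x = np.array([0.8, -0.6])

y = np.clip(F @ x, -lam, lam)              # saturated measurements

# keep only measurements strictly inside the saturation band
keep = np.abs(y) < lam
x_hat, *_ = np.linalg.lstsq(F[keep], y[keep], rcond=None)

print(np.allclose(x_hat, x))  # True: the unsaturated rows still span R^2
```

The redundancy of the frame is what makes recovery possible: with only two measurements, a single saturated coordinate would already destroy injectivity.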
Sparse free deconvolution under unknown noise level via eigenmatrix
IF 3.2 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-08-14 · DOI: 10.1016/j.acha.2025.101802
Lexing Ying
This note considers the spectral estimation problems of sparse spectral measures under unknown noise levels. The main technical tool is the eigenmatrix method for solving unstructured sparse recovery problems. When the noise level is determined, the free deconvolution reduces the problem to an unstructured sparse recovery problem to which the eigenmatrix method can be applied. To determine the unknown noise level, we propose an optimization problem based on the singular values of an intermediate matrix of the eigenmatrix method. Numerical results are provided for both the additive and multiplicative free deconvolutions.
Citations: 0
Sharp error estimates for target measure diffusion maps with applications to the committor problem
IF 3.2 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-08-14 · DOI: 10.1016/j.acha.2025.101803
Shashank Sule , Luke Evans , Maria Cameron
We obtain asymptotically sharp error estimates for the consistency error of the Target Measure Diffusion map (TMDmap) (Banisch et al. 2020), a variant of diffusion maps featuring importance sampling and hence allowing input data drawn from an arbitrary density. The derived error estimates include the bias error and the variance error. The resulting convergence rates are consistent with the approximation theory of graph Laplacians. The key novelty of our results lies in the explicit quantification of all the prefactors on leading-order terms. We also prove an error estimate for solutions of Dirichlet BVPs obtained using TMDmap, showing that the solution error is controlled by consistency error. We use these results to study an important application of TMDmap in the analysis of rare events in systems governed by overdamped Langevin dynamics using the framework of transition path theory (TPT). The cornerstone ingredient of TPT is the solution of the committor problem, a boundary value problem for the backward Kolmogorov PDE. Remarkably, we find that the TMDmap algorithm is particularly suited as a meshless solver to the committor problem due to the cancellation of several error terms in the prefactor formula. Furthermore, significant improvements in bias and variance errors occur when using a quasi-uniform sampling density. Our numerical experiments show that these improvements in accuracy are realizable in practice when using δ-nets as spatially uniform inputs to the TMDmap algorithm.
Citations: 0
Large data limit of the MBO scheme for data clustering: Γ-convergence of the thresholding energies
IF 3.2 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-08-14 · DOI: 10.1016/j.acha.2025.101800
Tim Laux , Jona Lelmi
In this work we present the first rigorous analysis of the MBO scheme for data clustering in the large data limit. Each iteration of the scheme corresponds to one step of implicit gradient descent for the thresholding energy on the similarity graph of some dataset. For a subset of the nodes of the graph, the thresholding energy at time h measures the amount of heat transferred from the subset to its complement at time h, rescaled by a factor √h. It is then natural to think that outcomes of the MBO scheme are (local) minimizers of this energy. We prove that the algorithm is consistent, in the sense that these (local) minimizers converge to (local) minimizers of a suitably weighted optimal partition problem.
Citations: 0
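The diffuse-then-threshold iteration of the MBO scheme can be sketched on a toy similarity graph. The graph, step size, and implicit heat step below are illustrative assumptions, not the paper's setup (the paper analyzes the Γ-limit of the thresholding energies, not a particular implementation); a well-separated two-cluster partition is a fixed point of one iteration.

```python
import numpy as np

# Similarity graph: two triangles {0,1,2} and {3,4,5} joined by edge (2,3)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A           # unnormalized graph Laplacian

def mbo_step(u, h):
    """One MBO iteration: implicit heat step for time h, then threshold."""
    u_diffused = np.linalg.solve(np.eye(len(u)) + h * L, u)
    return (u_diffused >= 0.5).astype(float)

u = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # indicator of one cluster
print(mbo_step(u, h=0.1))   # the two-triangle partition is a fixed point
```

For small h the diffusion barely leaks mass across the single bridging edge, so thresholding returns the same indicator; choosing h too large would instead collapse everything toward a constant vector.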
The Wigner distribution of Gaussian tempered generalized stochastic processes
IF 3.2 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-08-13 · DOI: 10.1016/j.acha.2025.101799
Patrik Wahlberg
We define the Wigner distribution of a tempered generalized stochastic process that is complex-valued symmetric Gaussian. This gives a time-frequency generalized stochastic process defined on the phase space. We study its covariance and our main result is a formula for the Weyl symbol of the covariance operator, expressed in terms of the Weyl symbol of the covariance operator of the original generalized stochastic process.
Citations: 0
Permutation-invariant representations with applications to graph deep learning
IF 3.2 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-08-06 · DOI: 10.1016/j.acha.2025.101798
Radu Balan , Naveed Haghani , Maneesh Singh
This paper presents primarily two Euclidean embeddings of the quotient space generated by matrices that are identified modulo arbitrary row permutations. The original application is in deep learning on graphs where the learning task is invariant to node relabeling. Two embedding schemes are introduced, one based on sorting and the other based on algebras of multivariate polynomials. While both embeddings exhibit a computational complexity exponential in problem size, the sorting based embedding is globally bi-Lipschitz and admits a low dimensional target space. Additionally, an almost everywhere injective scheme can be implemented with minimal redundancy and low computational cost. In turn, this proves that almost any classifier can be implemented with an arbitrary small loss of performance. Numerical experiments are carried out on two datasets, a chemical compound dataset (QM9) and a proteins dataset (PROTEINS_FULL).
Citations: 0
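The idea behind a sorting-based embedding can be sketched as follows. This is a simplified stand-in, not the paper's exact construction, and it illustrates only the permutation invariance, not the bi-Lipschitz property established there: sorting the rows of X into a canonical (lexicographic) order sends every row permutation of X to the same point.

```python
import numpy as np

def sort_embedding(X):
    """Hypothetical sketch of a sorting-based permutation-invariant map:
    sort the rows of X lexicographically, so any row permutation of X
    is mapped to the same flattened vector."""
    order = np.lexsort(X.T[::-1])   # lexicographic order on the rows
    return X[order].ravel()

X = np.array([[3.0, 1.0], [1.0, 2.0], [2.0, 0.0]])
P = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]], dtype=float)  # permutation

print(np.array_equal(sort_embedding(X), sort_embedding(P @ X)))  # True
```

This naive map is invariant but discontinuous at matrices with tied rows, which is exactly why the paper's careful constructions (and their Lipschitz analysis) are needed for learning tasks.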
Minibatch and local SGD: Algorithmic stability and linear speedup in generalization
IF 2.6 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-07-16 · DOI: 10.1016/j.acha.2025.101795
Yunwen Lei , Tao Sun , Mingrui Liu
The increasing scale of data propels the popularity of leveraging parallelism to speed up the optimization. Minibatch stochastic gradient descent (minibatch SGD) and local SGD are two popular methods for parallel optimization. The existing theoretical studies show a linear speedup of these methods with respect to the number of machines, which, however, is measured by optimization errors in a multi-pass setting. As a comparison, the stability and generalization of these methods are much less studied. In this paper, we study the stability and generalization analysis of minibatch and local SGD to understand their learnability by introducing an expectation-variance decomposition. We incorporate training errors into the stability analysis, which shows how small training errors help generalization for overparameterized models. We show minibatch and local SGD achieve a linear speedup to attain the optimal risk bounds.
Citations: 0
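The two parallel schemes compared above can be sketched on a least-squares problem (problem sizes, learning rate, and batch sizes below are illustrative assumptions, not values from the paper): minibatch SGD averages the M machines' gradients at every step, while local SGD lets each machine run several local steps before averaging the iterates.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, M = 200, 5, 4                  # samples, dimension, parallel machines
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

def grad(w, idx):                    # least-squares gradient on a batch
    return X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)

def minibatch_sgd(steps=200, lr=0.1, b=8):
    w = np.zeros(d)
    for _ in range(steps):           # each machine draws b samples; average
        g = np.mean([grad(w, rng.integers(0, n, b)) for _ in range(M)], axis=0)
        w -= lr * g
    return w

def local_sgd(rounds=40, local_steps=5, lr=0.1, b=8):
    w = np.zeros(d)
    for _ in range(rounds):          # machines run local steps, then average
        ws = []
        for _ in range(M):
            w_m = w.copy()
            for _ in range(local_steps):
                w_m -= lr * grad(w_m, rng.integers(0, n, b))
            ws.append(w_m)
        w = np.mean(ws, axis=0)
    return w

print(np.linalg.norm(minibatch_sgd() - w_true),
      np.linalg.norm(local_sgd() - w_true))
```

Both variants make 200 stochastic updates in total here; the paper's point is that their stability-based generalization bounds, not just their optimization errors, improve linearly with M.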
Multi-dimensional unlimited sampling and robust reconstruction
IF 2.6 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-07-16 · DOI: 10.1016/j.acha.2025.101796
Dorian Florescu, Ayush Bhandari
In this paper we introduce a new sampling and reconstruction approach for multi-dimensional analog signals. Building on top of the Unlimited Sensing Framework (USF), we present a new folded sampling operator called the multi-dimensional modulo-hysteresis that is also backwards compatible with the existing one-dimensional modulo operator. Unlike previous approaches, the proposed model is specifically tailored to multi-dimensional signals. In particular, the model uses certain redundancy in dimensions 2 and above, which is exploited for input recovery with robustness. We prove that the new operator is well-defined and its outputs have a bounded dynamic range. For the noiseless case, we derive a theoretically guaranteed input reconstruction approach. When the input is corrupted by Gaussian noise, we exploit redundancy in higher dimensions to provide a bound on the error probability and show this drops to 0 for high enough sampling rates leading to new theoretical guarantees for the noisy case. Our numerical examples corroborate the theoretical results and show that the proposed approach can handle a significantly larger amount of noise compared to USF.
Citations: 0
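The one-dimensional centered modulo operator that the multi-dimensional model extends can be sketched as follows (λ plays the role of the converter's saturation threshold; the signal below is an arbitrary illustration):

```python
import numpy as np

def modulo_fold(x, lam):
    """Centered modulo operator from the unlimited sensing literature:
    folds any amplitude into the range [-lam, lam)."""
    return np.mod(x + lam, 2 * lam) - lam

t = np.linspace(0, 1, 8, endpoint=False)
x = 3.0 * np.sin(2 * np.pi * t)          # exceeds the ADC range [-1, 1)
y = modulo_fold(x, lam=1.0)

print(np.all(np.abs(y) <= 1.0))          # True: outputs stay in range
```

Folding keeps the dynamic range bounded no matter how large the input is; the reconstruction problem is then to undo the folds, which is where the higher-dimensional redundancy of the proposed modulo-hysteresis operator is exploited.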
On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks
IF 2.6 · CAS Zone 2 (Mathematics) · Q1 MATHEMATICS, APPLIED · Pub Date: 2025-07-16 · DOI: 10.1016/j.acha.2025.101797
Yunfei Yang
This paper studies the problem of how efficiently functions in the Sobolev spaces W^{s,q}([0,1]^d) and Besov spaces B^s_{q,r}([0,1]^d) can be approximated by deep ReLU neural networks with width W and depth L, when the error is measured in the L^p([0,1]^d) norm. This problem has been studied by several recent works, which obtained the approximation rate O((WL)^{-2s/d}) up to logarithmic factors when p = q = ∞, and the rate O(L^{-2s/d}) for networks with fixed width when the Sobolev embedding condition 1/q - 1/p < s/d holds. We generalize these results by showing that the rate O((WL)^{-2s/d}) indeed holds under the Sobolev embedding condition. It is known that this rate is optimal up to logarithmic factors. The key tool in our proof is a novel encoding of sparse vectors by using deep ReLU neural networks with varied width and depth, which may be of independent interest.
Citations: 0
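The kind of ReLU approximation quantified above can be illustrated in one dimension with the classical hat-function construction: a single hidden layer of ReLU units reproduces the piecewise-linear interpolant of f on a uniform grid, with O(N^{-2}) uniform error for C^2 functions. This sketch only illustrates the width side of the tradeoff; the paper's optimal width-depth rates rely on much deeper constructions (sparse-vector encodings), not on this elementary scheme.

```python
import numpy as np

def relu_interpolant(f, N):
    """One-hidden-layer ReLU network realizing the piecewise-linear
    interpolant of f on a uniform grid of [0, 1] with N pieces."""
    knots = np.linspace(0.0, 1.0, N + 1)
    slopes = np.diff(f(knots)) / np.diff(knots)
    coef = np.diff(slopes, prepend=0.0)       # slope change at each knot
    def net(x):
        x = np.atleast_1d(x)
        hidden = np.maximum(x[:, None] - knots[None, :-1], 0.0)  # ReLU units
        return f(knots[0]) + hidden @ coef
    return net

f = lambda x: np.asarray(x) ** 2
net = relu_interpolant(f, N=64)
xs = np.linspace(0, 1, 1000)
print(np.max(np.abs(net(xs) - f(xs))) < 1e-3)  # True: error is O(N^-2)
```

Halving the grid spacing quarters the uniform error, which is the s = 2, d = 1 instance of the smoothness-driven rates discussed in the abstract.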