Variance-reduced first-order methods for deterministically constrained stochastic nonconvex optimization with strong convergence guarantees

Zhaosong Lu, Sanyou Mei, Yifeng Xiao
{"title":"Variance-reduced first-order methods for deterministically constrained stochastic nonconvex optimization with strong convergence guarantees","authors":"Zhaosong Lu, Sanyou Mei, Yifeng Xiao","doi":"arxiv-2409.09906","DOIUrl":null,"url":null,"abstract":"In this paper, we study a class of deterministically constrained stochastic\noptimization problems. Existing methods typically aim to find an\n$\\epsilon$-stochastic stationary point, where the expected violations of both\nthe constraints and first-order stationarity are within a prescribed accuracy\nof $\\epsilon$. However, in many practical applications, it is crucial that the\nconstraints be nearly satisfied with certainty, making such an\n$\\epsilon$-stochastic stationary point potentially undesirable due to the risk\nof significant constraint violations. To address this issue, we propose\nsingle-loop variance-reduced stochastic first-order methods, where the\nstochastic gradient of the stochastic component is computed using either a\ntruncated recursive momentum scheme or a truncated Polyak momentum scheme for\nvariance reduction, while the gradient of the deterministic component is\ncomputed exactly. Under the error bound condition with a parameter $\\theta \\geq\n1$ and other suitable assumptions, we establish that the proposed methods\nachieve a sample complexity and first-order operation complexity of $\\widetilde\nO(\\epsilon^{-\\max\\{4, 2\\theta\\}})$ for finding a stronger $\\epsilon$-stochastic\nstationary point, where the constraint violation is within $\\epsilon$ with\ncertainty, and the expected violation of first-order stationarity is within\n$\\epsilon$. To the best of our knowledge, this is the first work to develop\nmethods with provable complexity guarantees for finding an approximate\nstochastic stationary point of such problems that nearly satisfies all\nconstraints with certainty.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.09906","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In this paper, we study a class of deterministically constrained stochastic optimization problems. Existing methods typically aim to find an $\epsilon$-stochastic stationary point, where the expected violations of both the constraints and first-order stationarity are within a prescribed accuracy of $\epsilon$. However, in many practical applications, it is crucial that the constraints be nearly satisfied with certainty, making such an $\epsilon$-stochastic stationary point potentially undesirable due to the risk of significant constraint violations. To address this issue, we propose single-loop variance-reduced stochastic first-order methods, where the stochastic gradient of the stochastic component is computed using either a truncated recursive momentum scheme or a truncated Polyak momentum scheme for variance reduction, while the gradient of the deterministic component is computed exactly. Under the error bound condition with a parameter $\theta \geq 1$ and other suitable assumptions, we establish that the proposed methods achieve a sample complexity and first-order operation complexity of $\widetilde O(\epsilon^{-\max\{4, 2\theta\}})$ for finding a stronger $\epsilon$-stochastic stationary point, where the constraint violation is within $\epsilon$ with certainty, and the expected violation of first-order stationarity is within $\epsilon$. To the best of our knowledge, this is the first work to develop methods with provable complexity guarantees for finding an approximate stochastic stationary point of such problems that nearly satisfies all constraints with certainty.
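To make the variance-reduction idea concrete, the sketch below illustrates a STORM-style recursive-momentum gradient estimator with truncation (implemented here as norm clipping), combined with the exact gradient of a quadratic penalty for a deterministic constraint, on a toy problem. This is a minimal illustration under assumed choices — the toy objective, the clipping rule, and the fixed values of the step size, momentum weight, truncation radius, and penalty weight are all placeholders — and is not the paper's actual algorithm or parameter schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance: minimize E_xi[ 0.5 * ||x - xi||^2 ]   (stochastic component)
# subject to the deterministic constraint  c(x) = a^T x - 1 = 0.
# All parameters below (eta, beta, R, rho) are illustrative placeholders.
d = 5
a = rng.standard_normal(d)
a /= np.linalg.norm(a)

def stoch_grad(x, xi):
    """Stochastic gradient of the stochastic component for one sample xi."""
    return x - xi

def constraint(x):
    return a @ x - 1.0

def constraint_grad(x):
    return a

def truncate(v, radius):
    """Clip v onto the Euclidean ball of the given radius (one way to 'truncate')."""
    n = np.linalg.norm(v)
    return v if n <= radius else v * (radius / n)

eta, beta, R, rho = 0.05, 0.1, 10.0, 10.0   # step size, momentum, truncation radius, penalty weight
x_prev = x = np.zeros(d)
xi = rng.standard_normal(d)
g = truncate(stoch_grad(x, xi), R)          # initial estimator from a single sample

for t in range(200):
    # Search direction: variance-reduced stochastic gradient plus the EXACT
    # gradient of the quadratic penalty 0.5 * rho * c(x)^2 for the constraint.
    direction = g + rho * constraint(x) * constraint_grad(x)
    x_prev, x = x, x - eta * direction

    # Recursive-momentum update: evaluate the SAME fresh sample at x and x_prev,
    # correct the previous estimate by the gradient difference, then truncate.
    xi = rng.standard_normal(d)
    g = truncate(stoch_grad(x, xi) + (1.0 - beta) * (g - stoch_grad(x_prev, xi)), R)

print("constraint violation:", abs(constraint(x)))
```

In the paper, the truncation rule, momentum weight, and step sizes are chosen so that the constraint violation is controlled deterministically and the stated complexity bound holds; the fixed constants above are only meant to show the structure of a single-loop iteration.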