Smoothed Variable Sample-Size Accelerated Proximal Methods for Nonsmooth Stochastic Convex Programs

Q1 Mathematics Stochastic Systems Pub Date : 2018-03-02 DOI:10.1287/stsy.2022.0095

A. Jalilzadeh, U. Shanbhag, J. Blanchet, P. Glynn

{"title":"Smoothed Variable Sample-Size Accelerated Proximal Methods for Nonsmooth Stochastic Convex Programs","authors":"A. Jalilzadeh, U. Shanbhag, J. Blanchet, P. Glynn","doi":"10.1287/stsy.2022.0095","DOIUrl":null,"url":null,"abstract":"We consider the unconstrained minimization of the function F, where F = f + g, f is an expectation-valued nonsmooth convex or strongly convex function, and g is a closed, convex, and proper function. (I) Strongly convex f. When f is -strongly convex in x, traditional stochastic subgradient schemes (SSG) often display poor behavior, arising in part from noisy subgradients and diminishing steplengths. Instead, we apply a variable sample-size accelerated proximal scheme (VS-APM) on F, the Moreau envelope of F; we term such a scheme as (mVS-APM) and in contrast with (SSG) schemes, (mVS-APM) utilizes constant steplengths and increasingly exact gradients. We consider two settings. (a) Bounded domains. In this setting, (mVS-APM) displays linear convergence in inexact gradient steps, each of which requires utilizing an inner (prox-SSG) scheme. Specically, (mVS-APM) achieves an optimal oracle complexity in prox-SSG steps of [Formula: see text] with an iteration complexity of [Formula: see text] in inexact (outer) gradients of F to achieve an -accurate solution in mean-squared error, computed via an increasing number of inner (stochastic) subgradient steps; (b) Unbounded domains. In this regime, under an assumption of state-dependent bounds on subgradients, an unaccelerated variant (mVS-APM) is linearly convergent where increasingly exact gradients ∇xF(x) are approximated with increasing accuracy via (SSG) schemes. Notably, (mVS-APM) also displays an optimal oracle complexity of [Formula: see text]; (II) Convex f. 
When f is merely convex but smoothable, by suitable choices of the smoothing, steplength, and batch-size sequences, smoothed (VS-APM) (or sVS-APM) achieves an optimal oracle complexity of [Formula: see text] to obtain an -optimal solution. Our results can be specialized to two important cases: (a) Smooth f. Since smoothing is no longer required, we observe that (VS-APM) admits the optimal rate and oracle complexity, matching prior ndings; (b) Deterministic nonsmooth f. In the nonsmooth deterministic regime, (sVS-APM) reduces to a smoothed accelerated proximal method (s-APM) that is both asymptotically convergent and optimal in that it displays a complexity of [Formula: see text], matching the bound provided by Nesterov in 2005 for producing -optimal solutions. Finally, (sVS-APM) and (VS-APM) produce sequences that converge almost surely to a solution of the original problem.","PeriodicalId":36337,"journal":{"name":"Stochastic Systems","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2018-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Stochastic Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1287/stsy.2022.0095","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Mathematics","Score":null,"Total":0}

Citations: 8

Abstract

We consider the unconstrained minimization of the function F, where F = f + g, f is an expectation-valued nonsmooth convex or strongly convex function, and g is a closed, convex, and proper function.

(I) Strongly convex f. When f is μ-strongly convex in x, traditional stochastic subgradient (SSG) schemes often behave poorly, in part because of noisy subgradients and diminishing steplengths. Instead, we apply a variable sample-size accelerated proximal scheme (VS-APM) to the Moreau envelope of F; we term this scheme (mVS-APM). In contrast with (SSG) schemes, (mVS-APM) uses constant steplengths and increasingly exact gradients. We consider two settings. (a) Bounded domains. In this setting, (mVS-APM) displays linear convergence in inexact gradient steps, each of which requires an inner (prox-SSG) scheme. Specifically, (mVS-APM) achieves an optimal oracle complexity in prox-SSG steps of [Formula: see text] with an iteration complexity of [Formula: see text] in inexact (outer) gradients of the Moreau envelope to achieve an ε-accurate solution in mean-squared error, computed via an increasing number of inner (stochastic) subgradient steps. (b) Unbounded domains. In this regime, under an assumption of state-dependent bounds on subgradients, an unaccelerated variant (mVS-APM) is linearly convergent, where the gradients of the Moreau envelope are approximated with increasing accuracy via (SSG) schemes. Notably, (mVS-APM) also displays an optimal oracle complexity of [Formula: see text].

(II) Convex f. When f is merely convex but smoothable, by suitable choices of the smoothing, steplength, and batch-size sequences, smoothed (VS-APM) (or sVS-APM) achieves an optimal oracle complexity of [Formula: see text] to obtain an ε-optimal solution. Our results can be specialized to two important cases. (a) Smooth f. Since smoothing is no longer required, we observe that (VS-APM) admits the optimal rate and oracle complexity, matching prior findings. (b) Deterministic nonsmooth f. In the nonsmooth deterministic regime, (sVS-APM) reduces to a smoothed accelerated proximal method (s-APM) that is both asymptotically convergent and optimal in that it displays a complexity of [Formula: see text], matching the bound provided by Nesterov in 2005 for producing ε-optimal solutions.

Finally, (sVS-APM) and (VS-APM) produce sequences that converge almost surely to a solution of the original problem.
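To make the role of the Moreau envelope concrete, here is a minimal sketch on a deterministic toy problem. It is not the authors' scheme: the paper's f is expectation-valued and the proximal step is computed inexactly by an inner (prox-SSG) loop, whereas this example assumes h(u) = |u|, whose proximal operator has a closed form. The smoothing parameter name eta and all function names are hypothetical, chosen only for illustration. The sketch shows why a constant steplength is admissible: the envelope is differentiable even though h is not, and a gradient step of length eta on the envelope coincides with a proximal step.

```python
import math


def prox_abs(x: float, eta: float) -> float:
    """Proximal operator of h(u) = |u| with parameter eta (soft-thresholding)."""
    return math.copysign(max(abs(x) - eta, 0.0), x)


def moreau_envelope_abs(x: float, eta: float) -> float:
    """Envelope value min_u |u| + (u - x)^2 / (2 * eta), evaluated at the prox point."""
    u = prox_abs(x, eta)
    return abs(u) + (u - x) ** 2 / (2.0 * eta)


def moreau_gradient_abs(x: float, eta: float) -> float:
    """Gradient of the envelope: (x - prox(x)) / eta."""
    return (x - prox_abs(x, eta)) / eta


def gradient_descent_on_envelope(x0: float, eta: float, steps: int) -> float:
    """Constant-step gradient descent on the envelope (assumed step = eta)."""
    x = x0
    for _ in range(steps):
        # With step eta this update equals x <- prox_abs(x, eta).
        x -= eta * moreau_gradient_abs(x, eta)
    return x


if __name__ == "__main__":
    print(moreau_envelope_abs(2.0, 0.5))
    print(gradient_descent_on_envelope(2.0, 0.5, 5))
```

In the stochastic setting described in the abstract, the exact call to the proximal operator would be replaced by an approximate one whose accuracy improves across outer iterations; that part is not modeled here.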

Source journal

Stochastic Systems Decision Sciences-Statistics, Probability and Uncertainty

CiteScore

3.70

Self-citation rate

0.00%

Publication count