Smoothed Variable Sample-Size Accelerated Proximal Methods for Nonsmooth Stochastic Convex Programs
A. Jalilzadeh, U. Shanbhag, J. Blanchet, P. Glynn
Stochastic Systems (Journal Article), publication date 2018-03-02, JCR Q1 (Mathematics)
DOI: 10.1287/stsy.2022.0095 (https://doi.org/10.1287/stsy.2022.0095)
Citations: 8
Abstract
We consider the unconstrained minimization of the function F, where F = f + g, f is an expectation-valued nonsmooth convex or strongly convex function, and g is a closed, convex, and proper function. (I) Strongly convex f. When f is μ-strongly convex in x, traditional stochastic subgradient (SSG) schemes often perform poorly, in part because of noisy subgradients and diminishing steplengths. Instead, we apply a variable sample-size accelerated proximal scheme (VS-APM) to the Moreau envelope of F; we term this scheme (mVS-APM). In contrast with (SSG) schemes, (mVS-APM) uses constant steplengths and increasingly exact gradients. We consider two settings. (a) Bounded domains. In this setting, (mVS-APM) converges linearly in inexact gradient steps, each of which requires an inner (prox-SSG) scheme. Specifically, (mVS-APM) achieves an optimal oracle complexity in prox-SSG steps of [Formula: see text] with an iteration complexity of [Formula: see text] in inexact (outer) gradients of the Moreau envelope to achieve an ε-accurate solution in mean-squared error, where each outer gradient is computed via an increasing number of inner (stochastic) subgradient steps. (b) Unbounded domains. In this regime, under an assumption of state-dependent bounds on subgradients, an unaccelerated variant of (mVS-APM) is linearly convergent, with the gradients of the Moreau envelope approximated with increasing accuracy via (SSG) schemes. Notably, (mVS-APM) also displays an optimal oracle complexity of [Formula: see text]. (II) Convex f. When f is merely convex but smoothable, suitable choices of the smoothing, steplength, and batch-size sequences allow smoothed (VS-APM) (or sVS-APM) to achieve an optimal oracle complexity of [Formula: see text] to obtain an ε-optimal solution. Our results can be specialized to two important cases. (a) Smooth f. Since smoothing is no longer required, we observe that (VS-APM) admits the optimal rate and oracle complexity, matching prior findings. (b) Deterministic nonsmooth f. In the nonsmooth deterministic regime, (sVS-APM) reduces to a smoothed accelerated proximal method (s-APM) that is both asymptotically convergent and optimal in that it displays a complexity of [Formula: see text], matching the bound provided by Nesterov in 2005 for producing ε-optimal solutions. Finally, (sVS-APM) and (VS-APM) produce sequences that converge almost surely to a solution of the original problem.
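
The strongly convex case can be illustrated with a minimal sketch in the spirit of (mVS-APM); it is not the authors' implementation. It relies on the standard identity that a gradient step of length η on the Moreau envelope of F equals the proximal point prox_{ηF}(y). The outer loop is therefore an accelerated scheme with a constant steplength, and each proximal point is approximated by an inner stochastic subgradient loop whose budget grows geometrically. The toy problem F(x) = E|x − ξ| + (μ/2)x² with ξ ~ N(0, 1), the names `inexact_prox` and `mvs_apm_sketch`, and every parameter value (η, μ, the initial budget, the growth factor) are assumptions chosen for illustration only.

```python
import math
import random


def inexact_prox(y, eta, mu, n_inner, rng):
    """Approximate prox_{eta*F}(y) for the toy F(x) = E|x - xi| + (mu/2) x^2.

    Runs n_inner stochastic subgradient steps on the subproblem
        h(u) = F(u) + (u - y)^2 / (2 * eta),
    which is c-strongly convex with c = mu + 1/eta, using steps 1 / (c * t).
    """
    c = mu + 1.0 / eta
    u = y
    for t in range(1, n_inner + 1):
        xi = rng.gauss(0.0, 1.0)               # one sample of xi ~ N(0, 1)
        sign = 1.0 if u > xi else -1.0         # a subgradient of |u - xi|
        g = sign + mu * u + (u - y) / eta      # stochastic subgradient of h
        u -= g / (c * t)
    return u


def mvs_apm_sketch(x0, eta=1.0, mu=1.0, n_outer=20, n0=10, growth=1.5, seed=0):
    """Accelerated outer loop on the Moreau envelope with growing inner budgets.

    Illustrative assumptions: the envelope is (1/eta)-smooth and
    mu/(1 + eta*mu)-strongly convex, which fixes the momentum parameter.
    """
    rng = random.Random(seed)
    kappa = (1.0 + eta * mu) / (eta * mu)      # condition number of the envelope
    beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    x_prev = x = x0
    budgets = []
    for k in range(n_outer):
        y = x + beta * (x - x_prev)            # extrapolation (momentum) step
        n_inner = math.ceil(n0 * growth ** k)  # geometrically increasing budget
        budgets.append(n_inner)
        # Constant-step gradient step on the envelope == inexact proximal point.
        x_prev, x = x, inexact_prox(y, eta, mu, n_inner, rng)
    return x, budgets


if __name__ == "__main__":
    x_final, budgets = mvs_apm_sketch(x0=5.0)
    print("final iterate:", x_final)
    print("total inner subgradient steps:", sum(budgets))
```

By symmetry the minimizer of this toy problem is x = 0, so the final iterate should land close to zero. The sum of `budgets` counts the stochastic subgradient evaluations, which is the quantity the oracle complexity statements in the abstract refer to.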