Cubic Regularized ADMM with Convergence to a Local Minimum in Non-convex Optimization

2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton). Pub Date: 2019-09-01. DOI: 10.1109/ALLERTON.2019.8919772

Zai Shi, A. Eryilmaz

Abstract

Escaping saddle points is a critical issue in non-convex optimization. Previous methods for this problem mainly assume that the objective function is Hessian-Lipschitz, which leaves a gap for applications involving non-Hessian-Lipschitz functions. In this paper, we propose the Cubic Regularized Alternating Direction Method of Multipliers (CR-ADMM) to escape saddle points of separable non-convex functions containing a non-Hessian-Lipschitz component. By carefully choosing a parameter, we prove that CR-ADMM converges to a local minimum of the original function at a rate of $O(1/T^{1/3})$ in the time horizon $T$, which is faster than gradient-based methods. We also show that when one or more steps of CR-ADMM are not solved exactly, CR-ADMM can still converge to a neighborhood of the local minimum. In experiments on matrix factorization problems, CR-ADMM achieves a faster rate and a lower optimality gap than gradient-based methods. Our approach may also apply to other settings where regularized non-convex cost minimization is performed, such as parameter optimization of deep neural networks.
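The abstract does not spell out the CR-ADMM updates, so the following is not the authors' algorithm. It is a minimal sketch of the classical cubic-regularized Newton step that the method's name refers to, to illustrate why a cubic term can leave a saddle point where a gradient step cannot. The toy function $f(x, y) = \tfrac{1}{2}x^2 + \tfrac{1}{4}y^4 - \tfrac{1}{2}y^2$, the constant `M = 10`, and all function names are assumptions chosen for illustration. This toy function is locally Hessian-Lipschitz, so it does not exercise the non-Hessian-Lipschitz setting the paper targets.

```python
import numpy as np

def cubic_step(g, H, M):
    """Global minimizer of m(s) = g.s + 0.5*s.H.s + (M/6)*||s||^3.

    Stationarity gives (H + lam*I) s = -g with lam = (M/2)*||s||
    and H + lam*I positive semidefinite.
    """
    w, V = np.linalg.eigh(H)      # eigenvalues in ascending order
    gt = V.T @ g                  # gradient expressed in the eigenbasis
    lam_lo = max(0.0, -w[0])      # smallest shift that makes H + lam*I PSD

    def coeffs(lam):
        # Eigenbasis coefficients of s(lam) = -(H + lam*I)^{-1} g,
        # skipping directions where the shifted eigenvalue is zero.
        d = w + lam
        c = np.zeros_like(gt)
        ok = d > 1e-12
        c[ok] = -gt[ok] / d[ok]
        return c

    # "Hard case": the gradient has no component along the singular
    # directions of H + lam_lo*I (for example, exactly at a saddle point).
    singular = (w + lam_lo) <= 1e-12
    if np.all(np.abs(gt[singular]) <= 1e-12):
        c = coeffs(lam_lo)
        r = 2.0 * lam_lo / M
        if np.linalg.norm(c) <= r:
            # Pad along the most negative curvature direction so ||s|| = r.
            c[0] += np.sqrt(max(r * r - c @ c, 0.0))
            return V @ c

    # Otherwise solve ||s(lam)|| = 2*lam/M for lam > lam_lo by bisection.
    lo, hi = lam_lo, lam_lo + 1.0
    while np.linalg.norm(coeffs(hi)) > 2.0 * hi / M:
        hi = lam_lo + 2.0 * (hi - lam_lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(coeffs(mid)) > 2.0 * mid / M:
            lo = mid
        else:
            hi = mid
    return V @ coeffs(hi)

# Hypothetical toy objective: saddle at (0, 0), local minima at (0, +1) and (0, -1).
def grad(p):
    x, y = p
    return np.array([x, y**3 - y])

def hess(p):
    x, y = p
    return np.array([[1.0, 0.0], [0.0, 3.0 * y**2 - 1.0]])

def run_cubic_newton(p0, M=10.0, iters=30):
    p = np.array(p0, dtype=float)
    for _ in range(iters):
        p = p + cubic_step(grad(p), hess(p), M)
    return p

def run_gradient_descent(p0, lr=0.1, iters=1000):
    p = np.array(p0, dtype=float)
    for _ in range(iters):
        p = p - lr * grad(p)
    return p

if __name__ == "__main__":
    print("cubic Newton from saddle:", run_cubic_newton([0.0, 0.0]))
    print("gradient descent from saddle:", run_gradient_descent([0.0, 0.0]))
```

Started exactly at the saddle point, plain gradient descent never moves because the gradient is zero there, whereas the cubic step uses the negative Hessian eigenvalue to take a step of length $2|\lambda_{\min}|/M$ along the negative-curvature direction and then proceeds to one of the two minima. Which minimum is reached depends on the eigenvector sign returned by `np.linalg.eigh`.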
