Exit Time Analysis for Approximations of Gradient Descent Trajectories Around Saddle Points

IF 1.6 | Zone 4 (Mathematics) | JCR Q2, MATHEMATICS, APPLIED | Information and Inference: A Journal of the IMA | Pub Date: 2022-08-01 | DOI: 10.1093/imaiai/iaac025

Rishabh Dixit;Mert Gürbüzbalaban;Waheed U Bajwa

Volume 12, Issue 2, pages 714-786. Publication type: Journal Article. Not open access. Source: https://ieeexplore.ieee.org/document/10058609/

Citations: 3



Exit Time Analysis for Approximations of Gradient Descent Trajectories Around Saddle Points

This paper studies the exit time of trajectories of gradient-related first-order methods from saddle neighborhoods under some initial boundary conditions. Given the ‘flat’ geometry around saddle points, first-order methods can struggle to escape these regions quickly because the gradients encountered are small in magnitude. In particular, while it is known that gradient-related first-order methods escape strict-saddle neighborhoods, existing analytic techniques do not explicitly leverage the local geometry around saddle points to control the behavior of gradient trajectories. In this context, the paper puts forth a rigorous geometric analysis of the gradient-descent method around strict-saddle neighborhoods using matrix perturbation theory. In doing so, it provides a key result that can be used to generate an approximate gradient trajectory for any given initial conditions. In addition, the analysis leads to a linear exit-time solution for the gradient-descent method under certain necessary initial conditions, which explicitly brings out the dependence on problem dimension, conditioning of the saddle neighborhood, and more, for a class of strict-saddle functions.
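To make the notion of exit time concrete, the following is a minimal sketch on a toy problem, not the paper's analysis or algorithm. It assumes a purely quadratic strict saddle f(x) = 0.5 xᵀHx at the origin, with a hypothetical Hessian `H`, step size, and neighborhood radius chosen only for illustration, and counts the gradient-descent iterations needed to leave a ball around the saddle.

```python
import numpy as np

def exit_time(x0, hessian, step, radius, max_iters=10_000):
    """Count gradient-descent iterations until the iterate leaves the
    ball of the given radius around a saddle at the origin.

    Assumes the toy objective f(x) = 0.5 * x^T H x (illustrative only).
    Returns None if the iterate has not left within max_iters steps.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(max_iters):
        if np.linalg.norm(x) > radius:
            return k
        x = x - step * (hessian @ x)  # gradient of 0.5 * x^T H x is H x
    return None

# Hypothetical strict saddle: one stable direction (eigenvalue 1.0)
# and one unstable direction (eigenvalue -0.5).
H = np.diag([1.0, -0.5])

# Shrink the initial component along the unstable direction and
# observe how the exit time changes.
for u0 in (1e-2, 1e-4, 1e-6):
    x0 = np.array([0.05, u0])
    print(u0, exit_time(x0, H, step=0.1, radius=0.1))
```

In this toy setting the component along the unstable direction is multiplied by a fixed factor greater than one at every step, so the exit time grows as the initial projection onto that direction shrinks, and an initial point with zero projection never leaves the ball. This is one simple way to see why initial conditions matter for any exit-time statement.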


Source journal

Information and Inference-A Journal of the Ima Multiple-

CiteScore

3.90

Self-citation rate

0.00%

Number of articles published