Escaping saddle points efficiently with occupation-time-adapted perturbations

Xin Guo, Jiequn Han, Mahan Tajrobehkar, Wenpin Tang
{"title":"利用占用时间适应性扰动高效逃离鞍点","authors":"Xin Guo ,&nbsp;Jiequn Han ,&nbsp;Mahan Tajrobehkar ,&nbsp;Wenpin Tang","doi":"10.1016/j.jcmds.2024.100090","DOIUrl":null,"url":null,"abstract":"<div><p>Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms. In this mechanism, perturbations are adapted to the history of states via the notion of occupation time. After integrating this mechanism into the framework of perturbed gradient descent (PGD) and perturbed accelerated gradient descent (PAGD), two new algorithms are proposed: perturbed gradient descent adapted to occupation time (PGDOT) and its accelerated version (PAGDOT). PGDOT and PAGDOT are guaranteed to avoid getting stuck at non-degenerate saddle points, and are shown to converge to second-order stationary points at least as fast as PGD and PAGD, respectively. The theoretical analysis is corroborated by empirical studies in which the new algorithms consistently escape saddle points and outperform not only their counterparts, PGD and PAGD, but also other popular alternatives including stochastic gradient descent, Adam, and several state-of-the-art adaptive gradient methods.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"10 ","pages":"Article 100090"},"PeriodicalIF":0.0000,"publicationDate":"2024-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000014/pdfft?md5=ef92b7ba4259b7a90a297dea99cfb00a&pid=1-s2.0-S2772415824000014-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Escaping saddle points efficiently with occupation-time-adapted perturbations\",\"authors\":\"Xin Guo ,&nbsp;Jiequn Han ,&nbsp;Mahan Tajrobehkar ,&nbsp;Wenpin Tang\",\"doi\":\"10.1016/j.jcmds.2024.100090\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms. In this mechanism, perturbations are adapted to the history of states via the notion of occupation time. After integrating this mechanism into the framework of perturbed gradient descent (PGD) and perturbed accelerated gradient descent (PAGD), two new algorithms are proposed: perturbed gradient descent adapted to occupation time (PGDOT) and its accelerated version (PAGDOT). PGDOT and PAGDOT are guaranteed to avoid getting stuck at non-degenerate saddle points, and are shown to converge to second-order stationary points at least as fast as PGD and PAGD, respectively. 
The theoretical analysis is corroborated by empirical studies in which the new algorithms consistently escape saddle points and outperform not only their counterparts, PGD and PAGD, but also other popular alternatives including stochastic gradient descent, Adam, and several state-of-the-art adaptive gradient methods.</p></div>\",\"PeriodicalId\":100768,\"journal\":{\"name\":\"Journal of Computational Mathematics and Data Science\",\"volume\":\"10 \",\"pages\":\"Article 100090\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-01-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2772415824000014/pdfft?md5=ef92b7ba4259b7a90a297dea99cfb00a&pid=1-s2.0-S2772415824000014-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Computational Mathematics and Data Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2772415824000014\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Computational Mathematics and Data Science","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772415824000014","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms. In this mechanism, perturbations are adapted to the history of states via the notion of occupation time. After integrating this mechanism into the framework of perturbed gradient descent (PGD) and perturbed accelerated gradient descent (PAGD), two new algorithms are proposed: perturbed gradient descent adapted to occupation time (PGDOT) and its accelerated version (PAGDOT). PGDOT and PAGDOT are guaranteed to avoid getting stuck at non-degenerate saddle points, and are shown to converge to second-order stationary points at least as fast as PGD and PAGD, respectively. The theoretical analysis is corroborated by empirical studies in which the new algorithms consistently escape saddle points and outperform not only their counterparts, PGD and PAGD, but also other popular alternatives including stochastic gradient descent, Adam, and several state-of-the-art adaptive gradient methods.
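To make the mechanism concrete, the following is a minimal sketch (not the paper's algorithm) of how an occupation-time-adapted perturbation could be wired into a plain perturbed-gradient-descent loop: visit counts over coarse grid cells stand in for occupation time, and when the gradient is small the iterate is nudged toward the least-visited candidate cell, mimicking the self-repelling behaviour described in the abstract. The function name `pgdot_sketch`, the grid-cell bookkeeping, the candidate-sampling rule, and all hyperparameters are illustrative assumptions; the actual PGDOT/PAGDOT updates and their guarantees are given in the paper.

```python
import numpy as np

def pgdot_sketch(grad, x0, eta=0.01, g_thresh=1e-3, radius=0.1,
                 n_iters=500, perturb_every=100, n_candidates=8,
                 cell_size=0.05, seed=0):
    """Illustrative perturbed gradient descent with an occupation-time-biased
    perturbation (hypothetical variant, not the paper's PGDOT update)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    occupation = {}              # coarse grid cell -> number of visits
    last_perturb = -perturb_every

    def cell(z):
        # Coarse spatial hash used as a crude proxy for occupation time.
        return tuple(np.floor(z / cell_size).astype(int))

    for t in range(n_iters):
        occupation[cell(x)] = occupation.get(cell(x), 0) + 1
        g = grad(x)
        if np.linalg.norm(g) <= g_thresh and t - last_perturb >= perturb_every:
            # Near a stationary point: draw several candidate perturbations and
            # keep the one landing in the least-visited cell, pushing the
            # iterate away from where it has spent the most time.
            candidates = x + radius * rng.uniform(-1.0, 1.0,
                                                  size=(n_candidates, x.size))
            visits = [occupation.get(cell(c), 0) for c in candidates]
            x = candidates[int(np.argmin(visits))]
            last_perturb = t
        else:
            x = x - eta * g  # plain gradient step
    return x

if __name__ == "__main__":
    # f(x, y) = x^2 - y^2 has a non-degenerate saddle at the origin; the first
    # perturbation kicks the iterate off the saddle and it then follows the
    # negative-curvature (y) direction.
    grad_f = lambda z: np.array([2.0 * z[0], -2.0 * z[1]])
    print(pgdot_sketch(grad_f, x0=np.zeros(2)))
```

The least-visited-candidate rule above is only one crude proxy for occupation-time adaptation; the paper instead derives its perturbation mechanism from the super-diffusivity of self-repelling random walks.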
