Research on three-step accelerated gradient algorithm in deep learning

Statistical Theory and Related Fields, 6(1), 40–57 · IF 0.7 · Q3 Statistics & Probability · Pub Date: 2020-11-23 · DOI: 10.1080/24754269.2020.1846414
Yongqiang Lian, Yincai Tang, Shirong Zhou
{"title":"Research on three-step accelerated gradient algorithm in deep learning","authors":"Yongqiang Lian, Yincai Tang, Shirong Zhou","doi":"10.1080/24754269.2020.1846414","DOIUrl":null,"url":null,"abstract":"Gradient descent (GD) algorithm is the widely used optimisation method in training machine learning and deep learning models. In this paper, based on GD, Polyak's momentum (PM), and Nesterov accelerated gradient (NAG), we give the convergence of the algorithms from an initial value to the optimal value of an objective function in simple quadratic form. Based on the convergence property of the quadratic function, two sister sequences of NAG's iteration and parallel tangent methods in neural networks, the three-step accelerated gradient (TAG) algorithm is proposed, which has three sequences other than two sister sequences. To illustrate the performance of this algorithm, we compare the proposed algorithm with the three other algorithms in quadratic function, high-dimensional quadratic functions, and nonquadratic function. Then we consider to combine the TAG algorithm to the backpropagation algorithm and the stochastic gradient descent algorithm in deep learning. For conveniently facilitate the proposed algorithms, we rewite the R package ‘neuralnet’ and extend it to ‘supneuralnet’. All kinds of deep learning algorithms in this paper are included in ‘supneuralnet’ package. Finally, we show our algorithms are superior to other algorithms in four case studies.","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"40 - 57"},"PeriodicalIF":0.7000,"publicationDate":"2020-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24754269.2020.1846414","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Statistical Theory and Related Fields","FirstCategoryId":"96","ListUrlMain":"https://doi.org/10.1080/24754269.2020.1846414","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
引用次数: 0

Abstract

The gradient descent (GD) algorithm is a widely used optimisation method for training machine learning and deep learning models. In this paper, based on GD, Polyak's momentum (PM), and Nesterov accelerated gradient (NAG), we give the convergence of these algorithms from an initial value to the optimal value of an objective function in simple quadratic form. Building on the convergence property of the quadratic function, the two sister sequences of NAG's iteration, and the parallel tangent method in neural networks, we propose the three-step accelerated gradient (TAG) algorithm, which maintains three sequences rather than two sister sequences. To illustrate its performance, we compare the proposed algorithm with the other three algorithms on a quadratic function, high-dimensional quadratic functions, and a nonquadratic function. We then combine the TAG algorithm with the backpropagation algorithm and the stochastic gradient descent algorithm in deep learning. To make the proposed algorithms convenient to use, we rewrite the R package ‘neuralnet’ and extend it to ‘supneuralnet’. All the deep learning algorithms in this paper are included in the ‘supneuralnet’ package. Finally, we show that our algorithms are superior to the other algorithms in four case studies.
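The abstract does not spell out the TAG update equations, so the short sketch below covers only the three baseline methods it builds on: GD, Polyak's momentum, and NAG written with its two sister sequences, each applied to the simple quadratic f(x) = 0.5·x² with gradient ∇f(x) = x. It is written in R to match the paper's ‘supneuralnet’ toolchain, but it is not the paper's code; the step size eta, momentum coefficient mu, starting point, and iteration count are illustrative assumptions.

```r
f    <- function(x) 0.5 * x^2   # objective: a simple quadratic with optimum at 0
grad <- function(x) x           # its gradient

eta <- 0.1   # step size (illustrative assumption)
mu  <- 0.9   # momentum coefficient (illustrative assumption)
x0  <- 5     # starting point (illustrative assumption)
n   <- 50    # number of iterations

## Gradient descent: x_{k+1} = x_k - eta * grad(x_k)
gd <- function(x0, n) {
  x <- x0
  for (k in seq_len(n)) x <- x - eta * grad(x)
  x
}

## Polyak's momentum (heavy ball): keeps a velocity term v
pm <- function(x0, n) {
  x <- x0
  v <- 0
  for (k in seq_len(n)) {
    v <- mu * v - eta * grad(x)
    x <- x + v
  }
  x
}

## NAG with its two "sister sequences" x and y:
##   y_k = x_k + mu * (x_k - x_{k-1});  x_{k+1} = y_k - eta * grad(y_k)
nag <- function(x0, n) {
  x_prev <- x0
  x <- x0
  for (k in seq_len(n)) {
    y <- x + mu * (x - x_prev)
    x_prev <- x
    x <- y - eta * grad(y)
  }
  x
}

c(GD = gd(x0, n), PM = pm(x0, n), NAG = nag(x0, n))  # all approach the optimum 0
```

On this quadratic all three iterates approach the optimum 0. The TAG algorithm, as described in the abstract, adds a third sequence on top of NAG's two; its exact update rule is given in the paper and is not reproduced here.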