Stochastic diagonal approximate greatest descent in convolutional neural networks

H. Tan, K. Lim, H. Harno
Published in: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), 2017-09-01
DOI: 10.1109/ICSIPA.2017.8120653 (https://doi.org/10.1109/ICSIPA.2017.8120653)
Citations: 4

Abstract

Deep Convolutional Neural Networks (CNNs) have recently gained intense attention owing to their good performance in object recognition. One of the crucial components of a CNN is the mechanism for learning its weight parameters through backpropagation. In this paper, stochastic diagonal Approximate Greatest Descent (SDAGD) is proposed to train the weight parameters of a CNN. SDAGD adopts the concept of a multistage control system and a diagonal Hessian approximation for weight optimization. It can be described as a two-phase optimization. In phase 1, when the initial guess is far from the solution, SDAGD constructs local search regions and determines the step length of the next iteration at the boundary of the search region. Subsequently, when the solution lies within the final search region, SDAGD shifts to phase 2, approximating Newton's method to obtain fast weight convergence. Computing only the diagonal approximation of the Hessian costs less than computing the full Hessian. Experiments showed that the SDAGD learning algorithm achieved a misclassification rate of 8.85% on the MNIST dataset.
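The two-phase behaviour described in the abstract can be illustrated with a small sketch. The abstract does not give the update rule, so the form used here is an assumption: a damped diagonal-Newton step w ← w − g / (μ + |h|), with damping μ = ‖g‖ / R, where g is the gradient, h the diagonal of the Hessian, and R a search-region radius. The function name `sdagd_step`, the parameters `radius` and `eps`, the absolute value on the curvature, and the toy quadratic objective are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def sdagd_step(w, grad, hess_diag, radius=1.0, eps=1e-8):
    """One damped diagonal-Newton step (illustrative assumption, not the paper's code)."""
    # Damping is large when the gradient is large (far from a solution),
    # which limits the step length to at most `radius`.
    mu = np.linalg.norm(grad) / radius
    # Diagonal of (mu * I + H); abs() and eps are assumed safeguards
    # that keep the divisor positive.
    denom = mu + np.abs(hess_diag) + eps
    return w - grad / denom

# Toy objective: f(w) = 0.5 * sum(a * (w - t)**2), whose Hessian is exactly diag(a).
a = np.array([1.0, 10.0, 100.0])   # per-coordinate curvature
t = np.array([3.0, -2.0, 0.5])     # minimiser

w = np.zeros(3)
for _ in range(50):
    grad = a * (w - t)
    w = sdagd_step(w, grad, a, radius=1.0)
```

Under this assumed rule, the damping term dominates early on and each step stays within the radius, while near the minimum μ shrinks with the gradient and the update approaches a plain diagonal Newton step. The sketch uses exact gradients of a quadratic; a stochastic CNN setting would replace `grad` and `hess_diag` with mini-batch estimates.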