Stochastic diagonal approximate greatest descent in convolutional neural networks

2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), Pub Date: 2017-09-01, DOI: 10.1109/ICSIPA.2017.8120653

H. Tan, K. Lim, H. Harno


Citations: 4

Abstract

Deep Convolutional Neural Networks (CNNs) have recently attracted intense attention because of their good performance in object recognition. One of the crucial components of a CNN is the mechanism by which weight parameters are learned through backpropagation. In this paper, stochastic diagonal Approximate Greatest Descent (SDAGD) is proposed for training the weight parameters of a CNN. SDAGD adopts the concept of a multistage control system and a diagonal Hessian approximation for weight optimization. It can be described as a two-phase optimization. In phase 1, when the initial guess is far from the solution, SDAGD constructs local search regions and determines the step length of the next iteration at the boundary of each search region. Subsequently, when the solution lies within the final search region, SDAGD shifts to phase 2 and approximates Newton's method to obtain fast weight convergence. Computing only a diagonal approximation of the Hessian costs less than computing the full Hessian. The experiment showed that the SDAGD learning algorithm achieved a misclassification rate of 8.85% on the MNIST dataset.
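The abstract does not state the update rule, so the following is only a minimal sketch of how a two-phase, diagonally damped Newton step could look. It assumes an element-wise update of the form w - g / (mu + |h|) with mu = ||g|| / R, where g is the gradient, h is the diagonal Hessian approximation and R is the radius of the local search region. The function name `sdagd_step`, the `abs()` safeguard, the `eps` term and the toy quadratic objective are illustrative assumptions, not the authors' implementation.

```python
import math


def sdagd_step(w, grad, hess_diag, radius, eps=1e-8):
    """One illustrative step of an assumed SDAGD-style update.

    Element-wise: w_i - g_i / (mu + |h_i| + eps), with mu = ||g|| / radius.
    """
    grad_norm = math.sqrt(sum(g * g for g in grad))
    # Assumed damping term: large far from the solution (phase 1),
    # vanishing as the gradient shrinks (phase 2, Newton-like step).
    mu = grad_norm / radius
    # abs() and eps keep the denominator positive; both are assumptions.
    return [wi - gi / (mu + abs(hi) + eps)
            for wi, gi, hi in zip(w, grad, hess_diag)]


# Toy objective f(w) = 0.5 * sum(a_i * w_i^2); its Hessian is exactly
# diagonal, so the diagonal approximation is exact here.
curvatures = [1.0, 10.0]


def gradient(w):
    return [a * wi for a, wi in zip(curvatures, w)]


start = [5.0, -3.0]
first = sdagd_step(start, gradient(start), curvatures, radius=1.0)
# Length of the first step, taken while far from the minimum.
first_step_length = math.sqrt(sum((a - b) ** 2 for a, b in zip(first, start)))

w = list(start)
for _ in range(50):
    w = sdagd_step(w, gradient(w), curvatures, radius=1.0)
```

Under these assumptions the step length never exceeds the radius R, because each component is divided by at least mu = ||g|| / R. That mirrors the bounded search region of phase 1, and as the gradient shrinks the step tends toward a diagonal Newton step, as in phase 2.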



