{"title":"Demystifying sparse rectified auto-encoders","authors":"Kien Tran, H. Le","doi":"10.1145/2542050.2542065","DOIUrl":null,"url":null,"abstract":"Auto-Encoders can learn features similar to Sparse Coding, but the training can be done efficiently via the back-propagation algorithm as well as the features can be computed quickly for a new input. However, in practice, it is not easy to get Sparse Auto-Encoders working; there are two things that need investigating: sparsity constraint and weight constraint. In this paper, we try to understand the problem of training Sparse Auto-Encoders with L1-norm sparsity penalty, and propose a modified version of Stochastic Gradient Descent algorithm, called Sleep-Wake Stochastic Gradient Descent (SW-SGD), to solve this problem. Here, we focus on Sparse Auto-Encoders with rectified linear units in the hidden layer, called Sparse Rectified Auto-Encoders (SRAEs), because such units compute fast and can produce true sparsity (exact zeros). In addition, we propose a new reasonable way to constrain SRAEs' weights. Experiments on MNIST dataset show that the proposed weight constraint and SW-SGD help SRAEs successfully learn meaningful features that give excellent performance on classification task compared to other Auto-Encoder variants.","PeriodicalId":246033,"journal":{"name":"Proceedings of the 4th Symposium on Information and Communication Technology","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 4th Symposium on Information and Communication Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2542050.2542065","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Auto-Encoders can learn features similar to those of Sparse Coding, yet they can be trained efficiently via the back-propagation algorithm, and their features can be computed quickly for a new input. In practice, however, it is not easy to get Sparse Auto-Encoders working; two issues need investigation: the sparsity constraint and the weight constraint. In this paper, we study the problem of training Sparse Auto-Encoders with an L1-norm sparsity penalty and propose a modified version of the Stochastic Gradient Descent algorithm, called Sleep-Wake Stochastic Gradient Descent (SW-SGD), to solve it. We focus on Sparse Auto-Encoders with rectified linear units in the hidden layer, called Sparse Rectified Auto-Encoders (SRAEs), because such units are fast to compute and can produce true sparsity (exact zeros). In addition, we propose a new, reasonable way to constrain the weights of SRAEs. Experiments on the MNIST dataset show that the proposed weight constraint and SW-SGD help SRAEs successfully learn meaningful features that give excellent classification performance compared to other Auto-Encoder variants.
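The abstract pins down the training objective closely enough for a rough sketch: an auto-encoder with a ReLU hidden layer whose loss adds an L1 penalty on the hidden activations to the reconstruction error. The NumPy sketch below is an illustration under those assumptions only; the names (sgd_step, lam, W, V) and hyper-parameter values are hypothetical, plain SGD stands in for the paper's SW-SGD, and the proposed weight constraint is not reproduced, since the abstract describes neither.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 784, 196        # input dim (MNIST-sized) and hidden units; illustrative sizes
lam, lr = 0.1, 0.01    # L1 sparsity weight and SGD step size; hypothetical values

# Untied encoder/decoder weights. The paper's weight constraint is not
# described in the abstract, so no constraint is applied here.
W = rng.normal(0.0, 0.01, (k, d)); b = np.zeros(k)   # encoder
V = rng.normal(0.0, 0.01, (d, k)); c = np.zeros(d)   # decoder

def sgd_step(x):
    """One plain-SGD step on 0.5*||x_hat - x||^2 + lam*||h||_1 (not SW-SGD)."""
    h_pre = W @ x + b
    h = np.maximum(0.0, h_pre)          # ReLU: inactive units are exactly zero
    x_hat = V @ h + c
    d_xhat = x_hat - x                  # gradient of the reconstruction term
    # Since h >= 0, d(lam*||h||_1)/dh is simply lam on the active units.
    dh = V.T @ d_xhat + lam
    dh_pre = dh * (h_pre > 0.0)         # ReLU gate blocks gradients of dead units
    dV, dc = np.outer(d_xhat, h), d_xhat
    dW, db = np.outer(dh_pre, x), dh_pre
    for p, g in ((W, dW), (b, db), (V, dV), (c, dc)):
        p -= lr * g                     # in-place SGD update
    return 0.5 * np.sum(d_xhat ** 2) + lam * np.sum(h)

x = rng.random(d)                       # stand-in for one MNIST image in [0, 1]
for _ in range(100):
    loss = sgd_step(x)
print(f"loss after 100 steps: {loss:.4f}")
```

Because the hidden activations are non-negative, the L1 penalty reduces to lam times their sum, and together with the ReLU gate it drives inactive units to exact zeros, which is the "true sparsity" the abstract refers to.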