{"title":"Improving Convolutional Neural Network Using Pseudo Derivative ReLU","authors":"Zheng Hu, Yongping Li, Zhiyong Yang","doi":"10.1109/ICSAI.2018.8599372","DOIUrl":null,"url":null,"abstract":"Rectified linear unit (ReLU) is a widely used activation function in artificial neural networks, it is considered to be an efficient active function benefit from its simplicity and nonlinearity. However, ReLU’s derivative for negative inputs is zero, which can make some ReLUs inactive for essentially all inputs during the training. There are several ReLU variations for solving this problem. Comparing with ReLU, they are slightly different in form, and bring other drawbacks like more expensive in computation. In this study, pseudo derivatives were tried replacing original derivative of ReLU while ReLU itself was unchanged. The pseudo derivative was designed to alleviate the zero derivative problem and be consistent with original derivative in general. Experiments showed using pseudo derivative ReLU (PD-ReLU) could obviously improve AlexNet (a typical convolutional neural network model) in CIFAR-10 and CIFAR-100 tests. Furthermore, some empirical criteria for designing such pseudo derivatives were proposed.","PeriodicalId":375852,"journal":{"name":"2018 5th International Conference on Systems and Informatics (ICSAI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 5th International Conference on Systems and Informatics (ICSAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSAI.2018.8599372","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 16
Abstract
Rectified linear unit (ReLU) is a widely used activation function in artificial neural networks; it is considered an efficient activation function owing to its simplicity and nonlinearity. However, ReLU's derivative for negative inputs is zero, which can make some ReLUs inactive for essentially all inputs during training. Several ReLU variants have been proposed to address this problem. Compared with ReLU, they differ slightly in form and introduce other drawbacks, such as higher computational cost. In this study, pseudo derivatives were used in place of ReLU's original derivative, while ReLU itself was left unchanged. The pseudo derivative was designed to alleviate the zero-derivative problem while remaining broadly consistent with the original derivative. Experiments showed that using pseudo derivative ReLU (PD-ReLU) clearly improved AlexNet (a typical convolutional neural network model) on the CIFAR-10 and CIFAR-100 tests. Furthermore, some empirical criteria for designing such pseudo derivatives were proposed.
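The sketch below illustrates the general idea described in the abstract: keep the ReLU forward pass unchanged, but use a nonzero pseudo derivative for negative inputs in the backward pass. The paper does not specify its exact pseudo derivatives here, so the constant negative-side slope of 0.1, the PyTorch framework, and the names PDReLUFunction / pd_relu are illustrative assumptions, not the authors' formulation.

```python
# Minimal sketch (assumed, not the authors' exact pseudo derivative):
# forward pass is standard ReLU; backward pass substitutes a pseudo
# derivative that is nonzero for negative inputs so "dead" units
# still receive a gradient signal.
import torch

class PDReLUFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)  # unchanged ReLU output

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pseudo derivative: 1 for positive inputs (same as ReLU),
        # an assumed small constant 0.1 for negative inputs.
        pseudo_grad = torch.where(
            x > 0, torch.ones_like(x), torch.full_like(x, 0.1)
        )
        return grad_output * pseudo_grad

def pd_relu(x):
    return PDReLUFunction.apply(x)

if __name__ == "__main__":
    x = torch.randn(4, requires_grad=True)
    pd_relu(x).sum().backward()
    print(x.grad)  # negative inputs get gradient 0.1 instead of 0
```

In use, pd_relu would simply replace the ReLU activations in a network such as AlexNet; since the forward computation is identical to ReLU, only the training dynamics change.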