Grouped Pointwise Convolutions Reduce Parameters in Convolutional Neural Networks
Joao Paulo Schwarz Schuler, S. Romaní, M. Abdel-Nasser, Hatem A. Rashwan, D. Puig
Mendel, 2022-06-30. DOI: 10.13164/mendel.2022.1.023. Citations: 12
Abstract
In DCNNs, the number of parameters in pointwise convolutions grows rapidly because it is the product of the number of filters and the number of input channels coming from the previous layer. Our proposal makes pointwise convolutions parameter-efficient by grouping filters into parallel branches (groups), where each branch processes only a fraction of the input channels. However, doing so degrades the learning capability of the DCNN. To avoid this effect, we interleave the outputs of filters from different branches at intermediate layers of consecutive pointwise convolutions. We applied our improvement to the EfficientNet, DenseNet-BC L100, MobileNet and MobileNet V3 Large architectures. We trained these architectures on the CIFAR-10, CIFAR-100, Cropped-PlantDoc and Oxford-IIIT Pet datasets. When training from scratch, we obtained test accuracies similar to the original EfficientNet and MobileNet V3 Large architectures while saving up to 90% of the parameters and 63% of the FLOPs.
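To make the grouping-plus-interleaving idea concrete, the sketch below shows a grouped 1x1 convolution whose output channels are interleaved before the next grouped layer. This is a minimal illustration written in PyTorch, not the authors' implementation (the paper applies the scheme inside EfficientNet/MobileNet-style blocks); the class name, channel counts, and group count are assumptions chosen for the example.

```python
# Minimal sketch (illustrative, not the paper's code): a grouped pointwise
# convolution followed by channel interleaving across branches.
import torch
import torch.nn as nn

class GroupedPointwiseInterleaved(nn.Module):
    """1x1 convolution split into `groups` parallel branches, with output
    channels interleaved so the next grouped layer mixes all branches."""
    def __init__(self, in_channels, out_channels, groups):
        super().__init__()
        # Each branch sees only in_channels/groups inputs, cutting parameters
        # roughly by a factor of `groups` versus a plain 1x1 convolution.
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                              groups=groups, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU(inplace=True)
        self.groups = groups

    def forward(self, x):
        x = self.act(self.bn(self.conv(x)))
        # Channel interleaving: reshape (N, G, C/G, H, W), swap the group and
        # channel axes, and flatten back, so consecutive grouped pointwise
        # layers can exchange information between branches.
        n, c, h, w = x.shape
        x = x.view(n, self.groups, c // self.groups, h, w)
        x = x.transpose(1, 2).contiguous().view(n, c, h, w)
        return x

# Example: a plain 256->256 pointwise convolution has 256*256 = 65,536 weights;
# with 16 branches it needs only 16*(16*16) = 4,096, a ~94% reduction.
layer = GroupedPointwiseInterleaved(256, 256, groups=16)
y = layer(torch.randn(1, 256, 32, 32))
print(y.shape)  # torch.Size([1, 256, 32, 32])
```

The interleaving step is what preserves learning capability: without it, each branch would only ever see the channels produced by the same branch in the previous layer, so information could not flow across groups.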