Optimized separable convolution: Yet another efficient convolution operator

AI Open. Pub Date: 2022-01-01. DOI: 10.1016/j.aiopen.2022.10.002

Tao Wei, Yonghong Tian, Yaowei Wang, Yun Liang, Chang Wen Chen

AI Open, volume 3, pages 162-171. Journal article. Source: https://www.sciencedirect.com/science/article/pii/S2666651022000158

Optimized separable convolution: Yet another efficient convolution operator

The convolution operation is the most critical component in the recent surge of deep learning research. A conventional 2D convolution needs $O(C^{2} K^{2})$ parameters, where $C$ is the channel size and $K$ is the kernel size. This parameter count has become costly as models have grown rapidly to meet the needs of demanding applications. Among the various implementations of convolution, separable convolution has proven more efficient at reducing model size. For example, depth-wise separable convolution reduces the complexity to $O(C \cdot (C + K^{2}))$, while spatial separable convolution reduces it to $O(C^{2} K)$. However, these are ad hoc designs that do not guarantee an optimal separation in general. In this research, we propose a novel and principled operator called optimized separable convolution. By optimally designing the internal number of groups and the kernel sizes of a general separable convolution, it achieves a complexity of $O(C^{\frac{3}{2}} K)$. When the restriction on the number of separated convolutions is lifted, an even lower complexity of $O(C \cdot \log(C K^{2}))$ can be achieved. Experimental results demonstrate that the proposed optimized separable convolution achieves improved accuracy-#Params trade-offs over conventional, depth-wise, and depth/spatial separable convolutions.
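
To make the complexity figures above concrete, the following minimal Python sketch counts weights (biases ignored) for each operator with equal input and output channel size $C$. The function names are hypothetical. The last case is only an illustration of how an $O(C^{\frac{3}{2}} K)$ count can arise: it assumes a $K \times 1$ grouped convolution followed by a $1 \times K$ grouped convolution, each with $\sqrt{C}$ groups. The abstract does not spell out the actual construction, so this configuration should be read as an assumption, not as the authors' operator.

```python
import math


def conv_params(c: int, k: int) -> int:
    """Conventional 2D convolution: C * C * K * K weights."""
    return c * c * k * k


def depthwise_separable_params(c: int, k: int) -> int:
    """Depth-wise K x K convolution (C * K^2) plus 1 x 1 point-wise (C^2)."""
    return c * k * k + c * c


def spatial_separable_params(c: int, k: int) -> int:
    """A K x 1 convolution followed by a 1 x K convolution, both dense in channels."""
    return 2 * c * c * k


def grouped_two_stage_params(c: int, k: int, g1: int, g2: int) -> int:
    """Hypothetical two-stage separable convolution (an assumption for illustration).

    Stage 1 is a K x 1 convolution with g1 groups, stage 2 a 1 x K convolution
    with g2 groups. A grouped convolution with g groups holds (C / g) * C * K weights.
    """
    if c % g1 != 0 or c % g2 != 0:
        raise ValueError("channel size must be divisible by both group counts")
    return (c // g1) * c * k + (c // g2) * c * k


def balanced_two_stage_params(c: int, k: int) -> int:
    """Assumed balanced choice g1 = g2 = sqrt(C), giving 2 * C^(3/2) * K weights."""
    g = math.isqrt(c)
    if g * g != c:
        raise ValueError("this sketch assumes C is a perfect square")
    return grouped_two_stage_params(c, k, g, g)


if __name__ == "__main__":
    c, k = 256, 3
    print("conventional        ", conv_params(c, k))
    print("depth-wise separable", depthwise_separable_params(c, k))
    print("spatial separable   ", spatial_separable_params(c, k))
    print("balanced two-stage  ", balanced_two_stage_params(c, k))
```

For $C = 256$ and $K = 3$ the counts are 589824, 67840, 393216, and 24576 respectively; the last equals $2 C^{\frac{3}{2}} K = 2 \cdot 4096 \cdot 3$. These numbers say nothing about accuracy, which the abstract evaluates separately through accuracy-#Params trade-offs.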