Tao Wei, Yonghong Tian, Yaowei Wang, Yun Liang, Chang Wen Chen
AI Open, vol. 3, pp. 162-171, 2022. DOI: 10.1016/j.aiopen.2022.10.002
https://www.sciencedirect.com/science/article/pii/S2666651022000158
Optimized separable convolution: Yet another efficient convolution operator
The convolution operation is the most critical component in the recent surge of deep learning research. A conventional 2D convolution requires $O(C^2 K^2)$ parameters, where $C$ is the channel size and $K$ is the kernel size. This parameter count has become costly as models have grown rapidly to meet the needs of demanding applications. Among the various implementations of convolution, separable convolution has proven more efficient at reducing model size. For example, depth separable convolution reduces the complexity to $O(C \cdot (C + K^2))$, while spatial separable convolution reduces it to $O(C^2 K)$. However, these are ad hoc designs that do not guarantee an optimal separation in general. In this research, we propose a novel and principled operator called optimized separable convolution which, by optimally choosing the internal number of groups and the kernel sizes of general separable convolutions, achieves a complexity of $O(C^{3/2} K)$. When the restriction on the number of separated convolutions is lifted, an even lower complexity of $O(C \cdot \log(C K^2))$ can be achieved. Experimental results demonstrate that the proposed optimized separable convolution improves the accuracy-#Params trade-off over conventional, depth-wise, and depth/spatial separable convolutions.
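The parameter counts quoted in the abstract can be checked with a short calculation. The sketch below assumes equal input and output channel counts, no bias terms, and the standard weight count of a grouped convolution. The function names are illustrative and do not come from the paper. In particular, `grouped_spatial` is only one hypothetical configuration (two grouped convolutions with K x 1 and 1 x K kernels and about sqrt(C) groups each) whose count scales as C^(3/2) K; it is not necessarily the authors' optimized design.

```python
import math


def conv_params(c_in, c_out, k_h, k_w, groups=1):
    """Weight count of a grouped 2D convolution, ignoring bias."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k_h * k_w


def conventional(c, k):
    # One dense K x K convolution: C^2 * K^2 weights.
    return conv_params(c, c, k, k)


def depth_separable(c, k):
    # Depthwise K x K (groups = C) followed by pointwise 1 x 1: C * (K^2 + C).
    return conv_params(c, c, k, k, groups=c) + conv_params(c, c, 1, 1)


def spatial_separable(c, k):
    # Dense K x 1 followed by dense 1 x K: 2 * C^2 * K.
    return conv_params(c, c, k, 1) + conv_params(c, c, 1, k)


def grouped_spatial(c, k, groups=None):
    # Hypothetical illustration (an assumption, not the paper's exact operator):
    # grouped K x 1 followed by grouped 1 x K with about sqrt(C) groups each,
    # giving 2 * C^(3/2) * K weights when C is a perfect square.
    g = math.isqrt(c) if groups is None else groups
    return conv_params(c, c, k, 1, groups=g) + conv_params(c, c, 1, k, groups=g)


if __name__ == "__main__":
    c, k = 256, 3
    for fn in (conventional, depth_separable, spatial_separable, grouped_spatial):
        print(f"{fn.__name__:>18}: {fn(c, k)}")
```

For C = 256 and K = 3 this gives 589824, 67840, 393216, and 24576 weights respectively. These are raw counts with constant factors included, so they show the orders of growth but say nothing about accuracy.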