A Novel Structure of Convolutional Layers with a Higher Performance-Complexity Ratio for Semantic Segmentation
Authors: Yalong Jiang, Z. Chi
Published in: 2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV), 2018-11-01
DOI: https://doi.org/10.1109/ICARCV.2018.8580632
Citation count: 0
Abstract
In this paper, we study an important factor that determines the capacity of a CNN model and propose a novel structure of convolutional layers with a higher performance-complexity ratio. First, we explore how model capacity and the number of parameters relate to segmentation performance. Second, we propose a mechanism to optimize the structure of a CNN model for a specific task. The mechanism also converges better than current state-of-the-art methods for factorizing convolutional layers, such as MobileNet. Third, we propose a measure of the capacity of a CNN model based on the mutual information between hidden activations and inputs/outputs. This measure is highly correlated with segmentation performance. Experimental results on the segmentation of the PASCAL Person Parts Dataset show that the linear dependency among convolutional kernels is an important factor determining the capacity of a CNN model. They also demonstrate that our approach can successfully adjust the model capacity to best match the complexity of a dataset. The optimized CNN model achieves performance similar to Deeplab-V2 on the segmentation task with 100× fewer parameters, resulting in a significantly improved performance-complexity ratio.
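Two ideas in the abstract can be made concrete with a small sketch: how factorizing a convolutional layer reduces its parameter count, and what linear dependency among kernels means numerically. The code below is illustrative only and is not the structure or the capacity measure proposed in the paper. It assumes the standard depthwise separable factorization used by MobileNet, ignores bias terms, and uses hypothetical function names (`standard_conv_params`, `depthwise_separable_params`, `kernel_rank`) introduced here for illustration.

```python
from fractions import Fraction


def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution (bias terms ignored)."""
    return k * k * c_in + c_in * c_out


def kernel_rank(kernels):
    """Rank of the matrix whose rows are flattened kernels.

    A rank lower than the number of kernels means that some kernels are
    linear combinations of the others. Exact arithmetic on Fractions keeps
    this toy example free of floating-point tolerance choices.
    """
    rows = [[Fraction(v) for v in row] for row in kernels]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        if rank == len(rows):
            break
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next(
            (r for r in range(rank, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


if __name__ == "__main__":
    print(standard_conv_params(3, 256, 256))
    print(depthwise_separable_params(3, 256, 256))
    # The third kernel is the sum of the first two, so it adds no new direction.
    print(kernel_rank([[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]]))
```

For a 3 × 3 layer with 256 input and 256 output channels, the standard layer has 589,824 weights and the factorized one has 67,840, roughly 8.7 times fewer. In the toy rank example, three kernels span only two independent directions, which is the kind of redundancy that the abstract's finding on linear dependency refers to.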