Analysis of Layer Efficiency and Layer Reduction on Pre-Trained Deep Learning Models

2018 International Conference on System Science and Engineering (ICSSE) Pub Date : 2018-06-01 DOI:10.1109/ICSSE.2018.8520080

B. T. Nugraha, S. Su


Citations: 0

Abstract




Recent advances in deep learning let many industries and practitioners speed up product development. However, deep learning still faces issues such as overfitting and large model size. Large size greatly constrains the performance and portability of a deep learning model on resource-limited embedded devices. Because the paradigm is associated with the notion of “deep” layers, many researchers extend a pre-trained model with additional, deeper layers to solve their problems without knowing whether those layers are actually needed. To address these issues, we use the activation outputs, gradient outputs, and weights of each layer of a pre-trained model to measure that layer's efficiency. We compare the efficiency estimates from these measurements against manual layer reduction to identify the most relevant method, and we further validate the method by applying it to successive layer reductions. With this approach, we save up to 12x and 26x of the time of one manual layer reduction and re-training on VGG-16 and a custom AlexNet, respectively.
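The abstract does not give the formula behind the efficiency measure, so the following is only a minimal sketch of the general idea of ranking layers by statistics of their activations, gradients, and weights. The scoring rule (product of mean absolute values), the function names, and all numbers are assumptions made for illustration and are not taken from the paper.

```python
from statistics import fmean


def mean_abs(values):
    """Mean absolute value of a flat list of numbers."""
    return fmean(abs(v) for v in values)


def layer_scores(activations, gradients, weights):
    """Return one hypothetical efficiency score per layer.

    Each argument maps a layer name to a flat list of numbers collected
    from that layer. The score is an assumed proxy, not the paper's
    formula: the product of the three mean absolute values.
    """
    return {
        name: mean_abs(activations[name])
        * mean_abs(gradients[name])
        * mean_abs(weights[name])
        for name in weights
    }


def reduction_candidate(scores):
    """Name of the layer with the lowest score."""
    return min(scores, key=scores.get)


# Made-up per-layer values, for illustration only.
activations = {"conv1": [0.5, 1.5], "conv2": [0.0, 0.2], "fc": [1.0, 3.0]}
gradients = {"conv1": [0.2, -0.2], "conv2": [0.1, -0.1], "fc": [0.5, -0.5]}
weights = {"conv1": [1.0, -1.0], "conv2": [0.5, -0.5], "fc": [2.0, -2.0]}

scores = layer_scores(activations, gradients, weights)
candidate = reduction_candidate(scores)
print(candidate, scores)
```

Under this assumed proxy, the lowest-scoring layer would be the first one to try removing. Any such estimate would still need to be checked against an actual layer reduction and re-training run, which is the comparison the abstract describes.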
