Neural Network Model Pruning without Additional Computation and Structure Requirements

IF: 2 | Zone 3, Computer Science | Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Computer Supported Cooperative Work-The Journal of Collaborative Computing | Pub Date: 2023-05-24 | DOI: 10.1109/CSCWD57460.2023.10152777
Yin Xie, Yigui Luo, Haihong She, Zhaohong Xiang
Volume: 17 1 | Pages: 1734-1740 | Publication type: Journal Article | Open access: no
Citations: 0

Abstract

In past work, deep learning researchers typically fixed hyperparameters such as the model structure and learning rate first, and then used the training set to train the model's weights. However, unrestricted model structure design leads to massive neuron redundancy in neural network models. Pruning these redundant neurons both compresses storage effectively and accelerates computation. In this paper, we propose a method that uses the training set to prune the model structure during training: 1) train the initialized model until it has basically converged; 2) feed the entire training set into the model and calculate the activations of the neurons in each layer; 3) calculate the pruning threshold for each layer according to the pruning ratio, delete the neurons whose activation values are below the threshold, and delete the corresponding weights in the adjacent (preceding and following) layers; 4) continue training the pruned model until it converges. Deleting redundant neurons in this way not only greatly reduces the number of parameters in the model but also accelerates it. We applied this method to mainstream neural network models, VGGNet and ResNet, and achieved good results.
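Steps 2 and 3 of the abstract can be illustrated with a minimal NumPy sketch on a single fully connected hidden layer. This is not the authors' implementation: the abstract does not say how a neuron's activation is aggregated over the training set, so the mean post-ReLU activation is assumed here, and the names `mean_activations` and `prune_hidden_layer` are hypothetical. Training (steps 1 and 4) is omitted.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(X, W1, b1, W2):
    """Two-layer MLP: hidden ReLU layer followed by a linear output layer."""
    return relu(X @ W1 + b1) @ W2

def mean_activations(X, W1, b1):
    """Assumed score: mean post-ReLU activation of each hidden neuron over X."""
    return relu(X @ W1 + b1).mean(axis=0)

def prune_hidden_layer(X, W1, b1, W2, prune_ratio):
    """Delete hidden neurons whose score falls below a ratio-derived threshold.

    Shapes: W1 (n_in, n_hidden), b1 (n_hidden,), W2 (n_hidden, n_out).
    Returns the pruned weights and the indices of the neurons that were kept.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    score = mean_activations(X, W1, b1)
    n_prune = int(score.shape[0] * prune_ratio)
    # Threshold = score of the first neuron that survives when sorted ascending.
    threshold = np.sort(score)[n_prune]
    keep = np.flatnonzero(score >= threshold)
    # Remove the neuron's incoming weights (columns of W1, entries of b1)
    # and its outgoing weights (rows of W2).
    return W1[:, keep], b1[keep], W2[keep, :], keep

# Toy data: 16 hidden neurons, the first 4 made inactive by a large negative bias.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))
W1 = rng.normal(size=(8, 16))
b1 = np.zeros(16)
b1[:4] = -100.0
W2 = rng.normal(size=(16, 4))

W1p, b1p, W2p, keep = prune_hidden_layer(X, W1, b1, W2, prune_ratio=0.25)
same_output = np.allclose(forward(X, W1, b1, W2), forward(X, W1p, b1p, W2p))
```

In this toy case the pruned neurons never fire, so the pruned network reproduces the original outputs with fewer parameters. With real models, neurons with small but non-zero activations would also be removed, which is presumably why the method continues training after pruning. For convolutional networks such as VGGNet or ResNet, the same idea would apply per output channel rather than per neuron, which is again an assumption not spelled out in the abstract.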
Source journal
Computer Supported Cooperative Work-The Journal of Collaborative Computing (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)
CiteScore: 6.40
Self-citation rate: 4.20%
Articles published: 31
Review time: >12 weeks
About the journal: Computer Supported Cooperative Work (CSCW): The Journal of Collaborative Computing and Work Practices is devoted to innovative research in computer-supported cooperative work (CSCW). It provides an interdisciplinary and international forum for the debate and exchange of ideas concerning theoretical, practical, technical, and social issues in CSCW. The CSCW Journal arose in response to the growing interest in the design, implementation and use of technical systems (including computing, information, and communications technologies) which support people working cooperatively, and its scope remains to encompass the multifarious aspects of research within CSCW and related areas. The CSCW Journal focuses on research oriented towards the development of collaborative computing technologies on the basis of studies of actual cooperative work practices (where ‘work’ is used in the wider sense). That is, it welcomes in particular submissions that (a) report on findings from ethnographic or similar kinds of in-depth fieldwork of work practices with a view to their technological implications, (b) report on empirical evaluations of the use of extant or novel technical solutions under real-world conditions, and/or (c) develop technical or conceptual frameworks for practice-oriented computing research based on previous fieldwork and evaluations.