Accelerating the Run-Time of Convolutional Neural Networks through Weight Pruning and Quantization

Rajai Alhimdiat, W. Ashour, Ramy Battrawy, D. Stricker
{"title":"通过权值修剪和量化加速卷积神经网络的运行时间","authors":"Rajai Alhimdiat, W. Ashour, Ramy Battrawy, D. Stricker","doi":"10.1109/ieCRES57315.2023.10209460","DOIUrl":null,"url":null,"abstract":"Accelerating the processing of Convolutional Neural Networks (CNNs) is highly demand in the field of Artificial Intelligence (AI), particularly in computer vision domains. The efficiency of memory resources is crucial in measuring run-time, and weight pruning and quantization techniques have been studied extensively to optimize this efficiency. In this work, we investigate the contribution of these techniques to accelerate a pre-trained CNN model. We adapt the percentile-based weights pruning with focusing on unstructured pruning by dynamically adjusting the pruning thresholds based on the fine-tuning performance of the model. In the same context, we perform uniform quantization for presenting the weights values of the model’s parameters with a fixed number of bits. We implement different levels of post-training and aware-training -fine-tuning the model with the same learning rate and number of epochs as the original. We then refine-tune the model with a lower learning rate and a factor of 10x for both techniques. Finally, we combine the best levels of pruning and quantization and refine-tune the model to explore the best-pruned and quantized pre-trained model. We evaluate each level of the techniques and analyze their trade-offs. Our results demonstrate the effectiveness of our strategy in accelerating the CNN and improving its efficiency, and provide insights into the best combination of techniques to accelerate its inference time.","PeriodicalId":431920,"journal":{"name":"2023 8th International Engineering Conference on Renewable Energy & Sustainability (ieCRES)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Accelerating the Run-Time of Convolutional Neural Networks through Weight Pruning and Quantization\",\"authors\":\"Rajai Alhimdiat, W. Ashour, Ramy Battrawy, D. Stricker\",\"doi\":\"10.1109/ieCRES57315.2023.10209460\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accelerating the processing of Convolutional Neural Networks (CNNs) is highly demand in the field of Artificial Intelligence (AI), particularly in computer vision domains. The efficiency of memory resources is crucial in measuring run-time, and weight pruning and quantization techniques have been studied extensively to optimize this efficiency. In this work, we investigate the contribution of these techniques to accelerate a pre-trained CNN model. We adapt the percentile-based weights pruning with focusing on unstructured pruning by dynamically adjusting the pruning thresholds based on the fine-tuning performance of the model. In the same context, we perform uniform quantization for presenting the weights values of the model’s parameters with a fixed number of bits. We implement different levels of post-training and aware-training -fine-tuning the model with the same learning rate and number of epochs as the original. We then refine-tune the model with a lower learning rate and a factor of 10x for both techniques. Finally, we combine the best levels of pruning and quantization and refine-tune the model to explore the best-pruned and quantized pre-trained model. We evaluate each level of the techniques and analyze their trade-offs. 
Our results demonstrate the effectiveness of our strategy in accelerating the CNN and improving its efficiency, and provide insights into the best combination of techniques to accelerate its inference time.\",\"PeriodicalId\":431920,\"journal\":{\"name\":\"2023 8th International Engineering Conference on Renewable Energy & Sustainability (ieCRES)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 8th International Engineering Conference on Renewable Energy & Sustainability (ieCRES)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ieCRES57315.2023.10209460\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 8th International Engineering Conference on Renewable Energy & Sustainability (ieCRES)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ieCRES57315.2023.10209460","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Accelerating the processing of Convolutional Neural Networks (CNNs) is in high demand in the field of Artificial Intelligence (AI), particularly in computer vision. The efficiency of memory resources is crucial to run-time, and weight pruning and quantization techniques have been studied extensively to optimize this efficiency. In this work, we investigate the contribution of these techniques to accelerating a pre-trained CNN model. We adapt percentile-based weight pruning, focusing on unstructured pruning, by dynamically adjusting the pruning thresholds based on the fine-tuning performance of the model. In the same context, we perform uniform quantization to represent the weight values of the model's parameters with a fixed number of bits. We implement different levels of post-training and training-aware fine-tuning, first fine-tuning the model with the same learning rate and number of epochs as the original, and then fine-tuning it with a learning rate lowered by a factor of 10 for both techniques. Finally, we combine the best levels of pruning and quantization and fine-tune the model to obtain the best pruned and quantized pre-trained model. We evaluate each level of the techniques and analyze their trade-offs. Our results demonstrate the effectiveness of our strategy in accelerating the CNN and improving its efficiency, and provide insights into the best combination of techniques to accelerate its inference time.
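The paper itself does not include source code, but the pruning step described above can be illustrated concretely. The following is a minimal PyTorch sketch of percentile-based unstructured pruning applied layer by layer; the helper name `percentile_prune`, the toy model, and the 50th-percentile setting are illustrative assumptions, not the authors' implementation. In the paper, the percentile threshold is adjusted dynamically between fine-tuning rounds according to the model's performance.

```python
import numpy as np
import torch
import torch.nn as nn

def percentile_prune(model: nn.Module, percentile: float) -> None:
    """Unstructured pruning sketch: zero out every conv/linear weight whose
    magnitude falls below the given percentile of that layer's weights.
    (Assumption: in the paper the percentile is adjusted dynamically based
    on fine-tuning performance rather than fixed.)"""
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, (nn.Conv2d, nn.Linear)):
                w = module.weight
                # Per-layer magnitude threshold at the requested percentile.
                threshold = np.percentile(w.abs().cpu().numpy(), percentile)
                mask = (w.abs() >= threshold).to(w.dtype)
                w.mul_(mask)  # weights below the threshold become exactly zero

# Illustrative usage: prune the smallest 50% of weights in a toy CNN, then
# fine-tune (e.g. with the learning rate lowered by a factor of 10).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 30 * 30, 10)
)
percentile_prune(model, percentile=50.0)
```

Because the pruning is unstructured, the zeroed weights keep the tensor shapes unchanged; the run-time benefit comes from sparse storage and sparsity-aware kernels rather than from smaller dense layers.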
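Likewise, the uniform quantization step can be sketched as a simple quantize/de-quantize ("fake quantization") pass over the weights. The function below maps each conv/linear weight tensor onto 2^b evenly spaced levels spanning its min–max range; the function name, the 8-bit setting, and the toy model are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

def uniform_quantize_weights(model: nn.Module, num_bits: int = 8) -> None:
    """Uniform quantization sketch: represent each conv/linear weight tensor
    with a fixed number of bits by rounding onto 2**num_bits evenly spaced
    levels over the tensor's [min, max] range, then storing the de-quantized
    values (simulated / "fake" quantization)."""
    levels = 2 ** num_bits - 1
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, (nn.Conv2d, nn.Linear)):
                w = module.weight
                w_min, w_max = w.min(), w.max()
                scale = (w_max - w_min) / levels
                if scale == 0:                         # constant tensor, nothing to quantize
                    continue
                q = torch.round((w - w_min) / scale)   # integer codes in [0, levels]
                w.copy_(q * scale + w_min)             # de-quantized weight values

# Illustrative usage: 8-bit uniform quantization of a toy model's weights
# (in the paper this is applied to the pruned, pre-trained CNN and followed
# by a further fine-tuning pass).
model = nn.Sequential(nn.Conv2d(3, 16, kernel_size=3), nn.ReLU())
uniform_quantize_weights(model, num_bits=8)
```

Applying such a pass once after training corresponds to post-training quantization; a training-aware variant would apply it inside the fine-tuning loop so that the model can compensate for the rounding error.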