Applicability of Deep Learning Model Trainings on Embedded GPU Devices: An Empirical Study

Po-Hsuan Chou, Chao Wang, Chih-Shuo Mei
DOI: 10.1109/MECO58584.2023.10155048
Published in: 2023 12th Mediterranean Conference on Embedded Computing (MECO), June 6, 2023
Citations: 0

Abstract

The wide applications of deep learning techniques have motivated the inclusion of both embedded GPU devices and workstation GPU cards into contemporary Industrial Internet-of-Things (IIoT) systems. Due to substantial differences between the two types of GPUs, deep-learning model training in its current practice is run on GPU cards, and embedded GPU devices are used for inference or partial model training at best. To supply empirical evidence and aid decisions on deep learning workload placement, this paper reports a set of experiments on the timeliness and energy efficiency of each GPU type, running both Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model training. The results suggest that embedded GPUs did reduce the total energy cost despite the longer response time, but the amount of energy saving might not be significant in a practical sense. Further in this paper we report a case study of prognostics applications using LSTM. The results suggest that, by comparison, an embedded GPU may save about 90 percent of energy consumption at the cost of doubling the application response time. But neither the savings in energy cost nor the increase in response time is significant enough to impact the application. These findings suggest that it may be feasible to place model training workload on either workstation GPU or embedded GPU.
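The placement trade-off reported in the abstract (an embedded GPU saving roughly 90 percent of energy while doubling response time) can be illustrated with a back-of-the-envelope energy calculation. The power and runtime figures below are hypothetical assumptions chosen to match the reported pattern, not measurements from the paper:

```python
# Sketch of the energy/latency trade-off for model-training placement.
# All numeric figures are illustrative assumptions, not data from the paper.

def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed = average power draw x runtime."""
    return avg_power_watts * runtime_seconds

# Hypothetical training run: workstation GPU card vs. embedded GPU device.
workstation_j = energy_joules(avg_power_watts=250.0, runtime_seconds=600.0)
embedded_j = energy_joules(avg_power_watts=12.5, runtime_seconds=1200.0)

saving = 1.0 - embedded_j / workstation_j   # fraction of energy saved
slowdown = 1200.0 / 600.0                   # response-time ratio

print(f"energy saving: {saving:.0%}, response-time ratio: {slowdown:.1f}x")
```

With these assumed figures the embedded device saves 90 percent of the energy at twice the response time; whether that trade is acceptable depends, as the paper argues, on the application's deadline and energy budget.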