Enhancing Neural Architecture Search With Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices

IF 5.1 · CAS Zone 2, Computer Science · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · IEEE Transactions on Emerging Topics in Computing · Vol. 12, No. 3, pp. 780-794 · Pub Date: 2023-10-10 · DOI: 10.1109/TETC.2023.3322033
Alessio Burrello;Matteo Risso;Beatrice Alessandra Motetti;Enrico Macii;Luca Benini;Daniele Jahier Pagliari
{"title":"利用多种硬件限制增强神经架构搜索,以便在微型物联网设备上部署深度学习模型","authors":"Alessio Burrello;Matteo Risso;Beatrice Alessandra Motetti;Enrico Macii;Luca Benini;Daniele Jahier Pagliari","doi":"10.1109/TETC.2023.3322033","DOIUrl":null,"url":null,"abstract":"The rapid proliferation of computing domains relying on Internet of Things (IoT) devices has created a pressing need for efficient and accurate deep-learning (DL) models that can run on low-power devices. However, traditional DL models tend to be too complex and computationally intensive for typical IoT end-nodes. To address this challenge, Neural Architecture Search (NAS) has emerged as a popular design automation technique for co-optimizing the accuracy and complexity of deep neural networks. Nevertheless, existing NAS techniques require many iterations to produce a network that adheres to specific hardware constraints, such as the maximum memory available on the hardware or the maximum latency allowed by the target application. In this work, we propose a novel approach to incorporate multiple constraints into so-called Differentiable NAS optimization methods, which allows the generation, in a single shot, of a model that respects user-defined constraints on both memory and latency in a time comparable to a single standard training. The proposed approach is evaluated on five IoT-relevant benchmarks, including the MLPerf Tiny suite and Tiny ImageNet, demonstrating that, with a single search, it is possible to reduce memory and latency by 87.4% and 54.2%, respectively (as defined by our targets), while ensuring non-inferior accuracy on state-of-the-art hand-tuned deep neural networks for TinyML.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 3","pages":"780-794"},"PeriodicalIF":5.1000,"publicationDate":"2023-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing Neural Architecture Search With Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices\",\"authors\":\"Alessio Burrello;Matteo Risso;Beatrice Alessandra Motetti;Enrico Macii;Luca Benini;Daniele Jahier Pagliari\",\"doi\":\"10.1109/TETC.2023.3322033\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rapid proliferation of computing domains relying on Internet of Things (IoT) devices has created a pressing need for efficient and accurate deep-learning (DL) models that can run on low-power devices. However, traditional DL models tend to be too complex and computationally intensive for typical IoT end-nodes. To address this challenge, Neural Architecture Search (NAS) has emerged as a popular design automation technique for co-optimizing the accuracy and complexity of deep neural networks. Nevertheless, existing NAS techniques require many iterations to produce a network that adheres to specific hardware constraints, such as the maximum memory available on the hardware or the maximum latency allowed by the target application. In this work, we propose a novel approach to incorporate multiple constraints into so-called Differentiable NAS optimization methods, which allows the generation, in a single shot, of a model that respects user-defined constraints on both memory and latency in a time comparable to a single standard training. 
The proposed approach is evaluated on five IoT-relevant benchmarks, including the MLPerf Tiny suite and Tiny ImageNet, demonstrating that, with a single search, it is possible to reduce memory and latency by 87.4% and 54.2%, respectively (as defined by our targets), while ensuring non-inferior accuracy on state-of-the-art hand-tuned deep neural networks for TinyML.\",\"PeriodicalId\":13156,\"journal\":{\"name\":\"IEEE Transactions on Emerging Topics in Computing\",\"volume\":\"12 3\",\"pages\":\"780-794\"},\"PeriodicalIF\":5.1000,\"publicationDate\":\"2023-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Emerging Topics in Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10278089/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10278089/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

The rapid proliferation of computing domains relying on Internet of Things (IoT) devices has created a pressing need for efficient and accurate deep-learning (DL) models that can run on low-power devices. However, traditional DL models tend to be too complex and computationally intensive for typical IoT end-nodes. To address this challenge, Neural Architecture Search (NAS) has emerged as a popular design automation technique for co-optimizing the accuracy and complexity of deep neural networks. Nevertheless, existing NAS techniques require many iterations to produce a network that adheres to specific hardware constraints, such as the maximum memory available on the hardware or the maximum latency allowed by the target application. In this work, we propose a novel approach to incorporate multiple constraints into so-called Differentiable NAS optimization methods, which allows the generation, in a single shot, of a model that respects user-defined constraints on both memory and latency in a time comparable to a single standard training. The proposed approach is evaluated on five IoT-relevant benchmarks, including the MLPerf Tiny suite and Tiny ImageNet, demonstrating that, with a single search, it is possible to reduce memory and latency by 87.4% and 54.2%, respectively (as defined by our targets), while ensuring non-inferior accuracy on state-of-the-art hand-tuned deep neural networks for TinyML.
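To make the core idea concrete, below is a minimal sketch of how multiple hardware constraints can be folded into a Differentiable NAS (DNAS) objective: the task loss is augmented with hinge penalties that become nonzero only when differentiable estimates of the network's memory footprint and latency exceed user-defined targets. This is an illustrative reconstruction under stated assumptions, not the authors' implementation; the class name MultiConstraintDNASLoss, the estimator inputs est_memory and est_latency, and the penalty weights are hypothetical placeholders.

```python
# Illustrative sketch (NOT the paper's exact formulation): a DNAS-style loss
# combining a task loss with hinge penalties on two hardware constraints.
import torch
import torch.nn as nn

class MultiConstraintDNASLoss(nn.Module):
    """Task loss plus penalties that activate only when the architecture's
    estimated memory or latency exceeds a user-defined target."""

    def __init__(self, mem_target, lat_target, mem_weight=1.0, lat_weight=1.0):
        super().__init__()
        self.task_loss = nn.CrossEntropyLoss()
        self.mem_target = mem_target    # e.g., bytes of on-device memory
        self.lat_target = lat_target    # e.g., cycles or milliseconds
        self.mem_weight = mem_weight    # hypothetical penalty strengths
        self.lat_weight = lat_weight

    def forward(self, logits, labels, est_memory, est_latency):
        # est_memory / est_latency must be differentiable functions of the
        # architecture parameters, so gradients flow back into the search.
        loss = self.task_loss(logits, labels)
        # Hinge penalty: zero while the constraint is met, grows linearly
        # with the relative violation once the target is exceeded.
        loss = loss + self.mem_weight * torch.relu(est_memory / self.mem_target - 1.0)
        loss = loss + self.lat_weight * torch.relu(est_latency / self.lat_target - 1.0)
        return loss
```

In a typical DNAS setup, est_memory and est_latency would be computed as softmax-weighted sums of per-candidate-operation costs over the architecture parameters, which is what makes a single gradient-based search (rather than many constrained re-runs) possible.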
Source journal
IEEE Transactions on Emerging Topics in Computing
Category: Computer Science (miscellaneous)
CiteScore: 12.10
Self-citation rate: 5.10%
Articles published per year: 113
Journal description: IEEE Transactions on Emerging Topics in Computing publishes papers on emerging aspects of computer science, computing technology, and computing applications not currently covered by other IEEE Computer Society Transactions. Some examples of emerging topics in computing include: IT for Green, Synthetic and organic computing structures and systems, Advanced analytics, Social/occupational computing, Location-based/client computer systems, Morphic computer design, Electronic game systems, & Health-care IT.