Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained Deep Neural Networks

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design. Pub Date: 2022-06-01. DOI: 10.1145/3531437.3539720

Matteo Risso, A. Burrello, L. Benini, E. Macii, M. Poncino, D. J. Pagliari


Citations: 3

Abstract



Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained Deep Neural Networks

Neural Architecture Search (NAS) is increasingly used to explore automatically the trade-off between accuracy and computational complexity in Deep Learning (DL) architectures. When targeting tiny edge devices, the main deployment challenge is fitting within tight memory constraints, so most NAS algorithms use model size as the complexity metric. Other methods reduce the energy or latency of DL models by trading accuracy against the number of inference operations. Energy and memory are rarely considered simultaneously, particularly by low-search-cost Differentiable NAS (DNAS) solutions. We overcome this limitation by proposing the first DNAS that directly addresses the most realistic scenario from a designer's perspective: the co-optimization of accuracy and energy (or latency) under a memory constraint determined by the target hardware. We do so by combining two complexity-dependent loss functions during training, each with an independent strength. Testing on three edge-relevant tasks from the MLPerf Tiny benchmark suite, we obtain rich Pareto sets of architectures in the energy vs. accuracy space, with memory footprint constraints spanning from 75% down to 6.25% of the baseline networks. When deployed on a commercial edge device, the STM NUCLEO-H743ZI2, our networks span a range of 2.18× in energy consumption and 4.04% in accuracy for the same memory constraint, and reduce energy by up to 2.2× with a negligible accuracy drop with respect to the baseline.
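
As a rough illustration of the idea in the abstract, the sketch below adds two complexity penalties to a task loss, each with its own strength, and then filters candidate networks down to a Pareto set in the energy vs. accuracy space. The abstract does not give the exact form of the loss, so the linear weighted sum, the names `LossWeights`, `combined_loss`, and `pareto_front`, and all numeric values are assumptions made for illustration only, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class LossWeights:
    """Hypothetical strengths for the two complexity terms (assumed names)."""
    size: float    # strength of the model-size (memory) term
    energy: float  # strength of the energy (or latency) term


def combined_loss(task_loss, size_cost, energy_cost, weights):
    """Task loss plus two independently weighted complexity penalties.

    A linear combination is assumed here; the source only states that two
    complexity-dependent losses are combined with independent strength.
    """
    return task_loss + weights.size * size_cost + weights.energy * energy_cost


def pareto_front(points):
    """Keep (energy, accuracy) points that no other point dominates.

    A point dominates another if its energy is no higher and its accuracy
    is no lower, with at least one of the two strictly better.
    """
    front = []
    for i, (e_i, a_i) in enumerate(points):
        dominated = any(
            e_j <= e_i and a_j >= a_i and (e_j < e_i or a_j > a_i)
            for j, (e_j, a_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((e_i, a_i))
    return sorted(front)


# Made-up numbers, purely to exercise the functions.
example_loss = combined_loss(1.0, 2.0, 4.0, LossWeights(size=0.5, energy=0.25))
example_front = pareto_front([(1.0, 80.0), (2.0, 85.0), (2.5, 84.0), (3.0, 90.0)])
```

Sweeping the two strengths independently is what would, under these assumptions, yield several architectures at the same memory budget with different energy and accuracy, from which the non-dominated ones form the Pareto set.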

