无pu间模板的DNN加速器设计自动化硬件计算图

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2022-10-29 DOI:10.1145/3508352.3549342

Jun Yu Li, Wei Wang, Wufeng Li

{"title":"无pu间模板的DNN加速器设计自动化硬件计算图","authors":"Jun Yu Li, Wei Wang, Wufeng Li","doi":"10.1145/3508352.3549342","DOIUrl":null,"url":null,"abstract":"Existing deep neural network (DNN) accelerator design automation (ADA) methods adopt architecture templates to predetermine parts of design choices and then explore the left design choices beyond templates. These templates can be classified into intra-PU templates and inter-PU templates according to the architecture hierarchy. Since templates limit the flexibility of ADA, designing effective ADA methods without templates has become an important research topic. Although there have appeared some works to enhance the flexibility of ADA by removing intra-PU templates, to the best of our knowledge no existing works have studied ADA methods without inter-PU templates. ADA with predetermined inter-PU templates is typically inefficient in terms of resource utilization, especially for DNNs with complex topology. In this paper, we propose a novel method, called hardware computation graph (HCG), for ADA without inter-PU templates. Experiments show that HCG method can achieve competitive latency while using only 1.4x ~ 5x fewer on-chip memory, compared with existing state-of-the-art ADA methods.","PeriodicalId":270592,"journal":{"name":"2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"792 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Hardware Computation Graph for DNN Accelerator Design Automation without Inter-PU Templates\",\"authors\":\"Jun Yu Li, Wei Wang, Wufeng Li\",\"doi\":\"10.1145/3508352.3549342\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Existing deep neural network (DNN) accelerator design automation (ADA) methods adopt architecture templates to predetermine parts of design choices and then explore the left design choices beyond templates. These templates can be classified into intra-PU templates and inter-PU templates according to the architecture hierarchy. Since templates limit the flexibility of ADA, designing effective ADA methods without templates has become an important research topic. Although there have appeared some works to enhance the flexibility of ADA by removing intra-PU templates, to the best of our knowledge no existing works have studied ADA methods without inter-PU templates. ADA with predetermined inter-PU templates is typically inefficient in terms of resource utilization, especially for DNNs with complex topology. In this paper, we propose a novel method, called hardware computation graph (HCG), for ADA without inter-PU templates. Experiments show that HCG method can achieve competitive latency while using only 1.4x ~ 5x fewer on-chip memory, compared with existing state-of-the-art ADA methods.\",\"PeriodicalId\":270592,\"journal\":{\"name\":\"2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)\",\"volume\":\"792 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3508352.3549342\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3508352.3549342","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

现有的深度神经网络(DNN)加速器设计自动化(ADA)方法采用架构模板预先确定设计选择的部分，然后在模板之外探索剩余的设计选择。根据体系结构的不同，这些模板可以分为pu内模板和pu间模板。由于模板限制了ADA的灵活性，设计有效的无模板ADA方法已成为重要的研究课题。虽然已经出现了一些通过去除pu内模板来增强ADA灵活性的研究，但据我们所知，目前还没有研究没有pu间模板的ADA方法。具有预先确定的pu间模板的ADA在资源利用方面通常效率低下，特别是对于具有复杂拓扑结构的dnn。在本文中，我们提出了一种新的方法，称为硬件计算图(HCG)，为ADA没有内部pu模板。实验表明，与现有最先进的ADA方法相比，HCG方法可以在只使用1.4 ~ 5倍片上内存的情况下实现竞争性延迟。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Hardware Computation Graph for DNN Accelerator Design Automation without Inter-PU Templates

Existing deep neural network (DNN) accelerator design automation (ADA) methods adopt architecture templates to predetermine parts of design choices and then explore the left design choices beyond templates. These templates can be classified into intra-PU templates and inter-PU templates according to the architecture hierarchy. Since templates limit the flexibility of ADA, designing effective ADA methods without templates has become an important research topic. Although there have appeared some works to enhance the flexibility of ADA by removing intra-PU templates, to the best of our knowledge no existing works have studied ADA methods without inter-PU templates. ADA with predetermined inter-PU templates is typically inefficient in terms of resource utilization, especially for DNNs with complex topology. In this paper, we propose a novel method, called hardware computation graph (HCG), for ADA without inter-PU templates. Experiments show that HCG method can achieve competitive latency while using only 1.4x ~ 5x fewer on-chip memory, compared with existing state-of-the-art ADA methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

自引率

0.00%

发文量