Repercussions of Using DNN Compilers on Edge GPUs for Real Time and Safety Critical Systems: A Quantitative Audit

IF 2.6 · Zone 4 (Computer Science) · JCR Q3 (Computer Science, Hardware & Architecture) · ACM Journal on Emerging Technologies in Computing Systems · Pub Date: 2023-08-03 · DOI: 10.1145/3611016

Omais Shafi, Mohammad Khalid Pandit, Amarjeet Saini, Gayathri Ananthanarayanan, Rijurekha Sen


Citations: 1

Abstract

Rapid advancements in edge devices have led to large-scale deployment of deep neural network (DNN) based workloads. To utilize edge resources effectively, many DNN compilers have been proposed that map high-level DNN models developed in frameworks such as PyTorch, TensorFlow, and Caffe into minimal, deployable, lightweight execution engines. For real-time applications such as advanced driver-assistance systems (ADAS), these compiler-optimized engines should give precise, reproducible, and predictable inferences, in terms of both runtime and output consistency. This paper is the first effort to empirically audit the state-of-the-art DNN compilers TensorRT, AutoTVM, and AutoScheduler. We characterize the compilers by their performance predictability with respect to inference latency, output reproducibility, hardware utilization, etc., and provide recommendations based on these findings. Our methodology and findings can potentially help application developers make an informed decision about the choice of DNN compiler in a real-time, safety-critical setting.
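To make the two audited properties concrete, the following is a minimal sketch of how latency predictability and output reproducibility could be measured for any inference callable. It is not the authors' methodology: the function `audit_inference`, the stand-in `toy_engine`, and the chosen statistics (mean, standard deviation, 99th percentile, worst case, maximum absolute output difference) are all assumptions made for illustration.

```python
import statistics
import time
from typing import Callable, Sequence


def audit_inference(
    infer: Callable[[Sequence[float]], Sequence[float]],
    sample: Sequence[float],
    runs: int = 100,
) -> dict:
    """Run `infer` repeatedly on one input; summarize latency spread and output drift.

    Assumption: `infer` blocks until its result is ready. A GPU-backed engine
    would need an explicit device synchronization inside the callable for the
    timings to be meaningful.
    """
    if runs < 2:
        raise ValueError("need at least two runs to measure variability")

    latencies_ms = []
    reference = None
    identical = True
    max_abs_diff = 0.0

    for _ in range(runs):
        start = time.perf_counter()
        output = list(infer(sample))
        latencies_ms.append((time.perf_counter() - start) * 1e3)

        if reference is None:
            reference = output  # the first run serves as the baseline
            continue
        identical = identical and output == reference
        diff = max((abs(a - b) for a, b in zip(reference, output)), default=0.0)
        max_abs_diff = max(max_abs_diff, diff)

    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "stdev_ms": statistics.stdev(latencies_ms),
        "p99_ms": statistics.quantiles(latencies_ms, n=100)[98],
        "worst_ms": max(latencies_ms),
        "max_abs_output_diff": max_abs_diff,
        "outputs_identical": identical,
    }


def toy_engine(x: Sequence[float]) -> list:
    # Hypothetical deterministic stand-in for a compiled execution engine.
    return [2.0 * v + 1.0 for v in x]


report = audit_inference(toy_engine, [0.5, -1.0, 3.0], runs=50)
```

In a real-time setting the worst-case and tail latencies usually matter more than the mean, which is why the sketch reports them separately. Comparing outputs across repeated runs of the same engine checks only run-to-run consistency; comparing engines rebuilt by the same compiler would need a second loop around the build step.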

Source journal

ACM Journal on Emerging Technologies in Computing Systems (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

4.80

Self-citation rate

4.50%

Articles published

Review time

3 months

About the journal: The Journal of Emerging Technologies in Computing Systems invites submissions of original technical papers describing research and development in emerging technologies in computing systems. Major economic and technical challenges are expected to impede the continued scaling of semiconductor devices. This has resulted in the search for alternate mechanical, biological/biochemical, nanoscale electronic, asynchronous and quantum computing and sensor technologies. As the underlying nanotechnologies continue to evolve in the labs of chemists, physicists, and biologists, it has become imperative for computer scientists and engineers to translate the potential of the basic building blocks (analogous to the transistor) emerging from these labs into information systems. Their design will face multiple challenges: the inherent (un)reliability due to the self-assembly nature of the fabrication processes for nanotechnologies, the complexity due to the sheer volume of nanodevices that will have to be integrated for complex functionality, and the need to integrate these new nanotechnologies with silicon devices in the same system. The journal provides comprehensive coverage of innovative work in the specification, design analysis, simulation, verification, testing, and evaluation of computing systems constructed out of emerging technologies and advanced semiconductors.