Omais Shafi, Mohammad Khalid Pandit, Amarjeet Saini, Gayathri Ananthanarayanan, Rijurekha Sen
Journal: ACM Journal on Emerging Technologies in Computing Systems (JCR category: Computer Science, Hardware & Architecture; Q3; impact factor 2.1)
Published: 2023-08-03
DOI: 10.1145/3611016 (https://doi.org/10.1145/3611016)
Repercussions of Using DNN Compilers on Edge GPUs for Real Time and Safety Critical Systems: A Quantitative Audit
Rapid advancements in edge devices have led to large-scale deployment of deep neural network (DNN) based workloads. To utilize the resources at the edge effectively, many DNN compilers have been proposed that efficiently map high-level DNN models developed in frameworks such as PyTorch, TensorFlow, and Caffe into minimal, deployable, lightweight execution engines. For real-time applications like advanced driver-assistance systems (ADAS), these compiler-optimized engines should give precise, reproducible, and predictable inferences, both in terms of runtime and output consistency. This paper is the first effort to empirically audit the state-of-the-art DNN compilers TensorRT, AutoTVM, and AutoScheduler. We characterize the DNN compilers based on their performance predictability with respect to inference latency, output reproducibility, hardware utilization, etc., and provide recommendations based on these findings. Our methodology and findings can potentially help application developers make informed decisions about the choice of DNN compiler in a real-time, safety-critical setting.
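The two properties the abstract names, latency predictability and output reproducibility, can be illustrated with a small measurement harness. The following is a minimal sketch and not the paper's methodology: the function names (`summarize`, `latency_profile`, `is_reproducible`), the choice of statistics, and the tolerance parameter are all assumptions made for illustration, and `infer` stands in for any compiled engine's inference call.

```python
import math
import statistics
import time
from typing import Callable, Dict, Sequence


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Summarize latency samples (seconds); a high cv or p99 suggests jitter."""
    ordered = sorted(samples)
    mean = statistics.fmean(ordered)
    stdev = statistics.pstdev(ordered)
    # Nearest-rank 99th percentile (assumed definition, chosen for simplicity).
    p99_index = max(0, math.ceil(0.99 * len(ordered)) - 1)
    return {
        "mean": mean,
        "stdev": stdev,
        "cv": stdev / mean if mean else 0.0,  # coefficient of variation
        "p99": ordered[p99_index],
        "max": ordered[-1],
    }


def latency_profile(infer: Callable[[], object], n_runs: int = 50) -> Dict[str, float]:
    """Time n_runs calls of a hypothetical inference callable."""
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        infer()
        samples.append(time.perf_counter() - start)
    return summarize(samples)


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two output vectors."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def is_reproducible(outputs: Sequence[Sequence[float]], tol: float = 0.0) -> bool:
    """True if every run's output matches the first run within tol."""
    reference = outputs[0]
    return all(max_abs_diff(reference, out) <= tol for out in outputs[1:])
```

With `tol=0.0` the check demands bitwise-equal values across runs; a small positive tolerance allows for benign floating-point differences, and which setting is appropriate depends on the application's safety requirements.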
About the journal:
The Journal of Emerging Technologies in Computing Systems invites submissions of original technical papers describing research and development in emerging technologies in computing systems. Major economic and technical challenges are expected to impede the continued scaling of semiconductor devices. This has resulted in the search for alternate mechanical, biological/biochemical, nanoscale electronic, asynchronous and quantum computing and sensor technologies. As the underlying nanotechnologies continue to evolve in the labs of chemists, physicists, and biologists, it has become imperative for computer scientists and engineers to translate the potential of the basic building blocks (analogous to the transistor) emerging from these labs into information systems. Their design will face multiple challenges ranging from the inherent (un)reliability due to the self-assembly nature of the fabrication processes for nanotechnologies, from the complexity due to the sheer volume of nanodevices that will have to be integrated for complex functionality, and from the need to integrate these new nanotechnologies with silicon devices in the same system.
The journal provides comprehensive coverage of innovative work in the specification, design analysis, simulation, verification, testing, and evaluation of computing systems constructed out of emerging technologies and advanced semiconductors.