The Benefit of the Doubt: Uncertainty Aware Sensing for Edge Computing Platforms
Lorena Qendro, Jagmohan Chauhan, Alberto Gil C. P. Ramos, C. Mascolo
2021 IEEE/ACM Symposium on Edge Computing (SEC), pages 214-227. DOI: 10.1145/3453142.3492330
Citations: 4
Abstract
Neural networks (NNs) have drastically improved the performance of mobile and embedded applications, but they lack measures of "reliability" that would enable reasoning over their predictions. Despite its vital importance, especially in areas of human well-being and health, state-of-the-art uncertainty estimation is computationally expensive when applied to resource-constrained devices. We propose an efficient framework for predictive uncertainty estimation in NNs deployed on edge computing platforms, with no need for fine-tuning or re-training strategies. To meet the energy and latency requirements of these systems, the framework is built from the ground up to provide predictive uncertainty based on only one forward pass and a negligible number of additional matrix multiplications. Our aim is to enable already trained deep learning models to generate uncertainty estimates on resource-limited devices at inference time, focusing on classification tasks. The framework is founded on theoretical developments casting dropout training as approximate inference in Bayesian NNs. Our novel layerwise distribution approximation to the convolution layer cascades through the network, providing uncertainty estimates in a single run. This ensures minimal overhead, especially compared with uncertainty techniques that require multiple forward passes and a corresponding linear rise in energy and latency, which makes them unsuitable in practice. We demonstrate that our approach yields better performance and flexibility than previous work based on multilayer perceptrons for obtaining uncertainty estimates. Our evaluation with mobile application datasets on the Nvidia Jetson TX2 and Nano shows that our approach not only obtains robust and accurate uncertainty estimates but also outperforms state-of-the-art methods in terms of system performance, reducing energy consumption (by up to 28x) and keeping memory overhead at a minimum while still improving accuracy (by up to 16%).
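To make the single-pass idea concrete, the sketch below illustrates moment propagation through a dropout + convolution + ReLU block: instead of running many Monte Carlo dropout passes, a mean and variance are carried through each layer analytically. This is a minimal illustration written by the editor, not the authors' implementation; the function names, the elementwise-independence assumption, and the Gaussian moment matching at the ReLU are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def dropout_moments(mu, var, p):
    """Propagate mean/variance through inverted Bernoulli dropout with rate p.

    E[(m/q) * x] = mu;  Var[(m/q) * x] = (var + mu^2)/q - mu^2, with q = 1 - p.
    """
    q = 1.0 - p
    return mu, (var + mu ** 2) / q - mu ** 2

def conv_moments(mu, var, weight, bias=None, stride=1, padding=0):
    """Propagate mean/variance through a conv layer.

    Assumes (approximately) independent input activations, so the variance
    map is convolved with the squared weights.
    """
    mu_out = F.conv2d(mu, weight, bias, stride=stride, padding=padding)
    var_out = F.conv2d(var, weight ** 2, None, stride=stride, padding=padding)
    return mu_out, var_out

def relu_moments(mu, var):
    """Gaussian moment matching through ReLU (standard closed form)."""
    std = torch.sqrt(var + 1e-8)
    z = mu / std
    normal = torch.distributions.Normal(0.0, 1.0)
    cdf, pdf = normal.cdf(z), torch.exp(normal.log_prob(z))
    mu_out = mu * cdf + std * pdf
    var_out = (mu ** 2 + var) * cdf + mu * std * pdf - mu_out ** 2
    return mu_out, torch.clamp(var_out, min=0.0)

# Example: one block, evaluated in a single forward pass.
x = torch.randn(1, 3, 32, 32)          # deterministic input: zero variance
w = torch.randn(8, 3, 3, 3) * 0.1      # hypothetical pretrained conv weights
mu, var = dropout_moments(x, torch.zeros_like(x), p=0.2)
mu, var = conv_moments(mu, var, w, padding=1)
mu, var = relu_moments(mu, var)
print(mu.shape, var.mean())            # per-activation mean and variance
```

Cascading such mean/variance pairs through every layer and reading off the variance of the final logits is what gives a predictive uncertainty estimate in one run, in contrast to MC dropout, whose cost grows linearly with the number of stochastic forward passes.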