Jordan Schmerge, Daniel Mawhirter, Connor Holmes, Jedidiah McClurg, Bo-Zong Wu
{"title":"ELIχR: Eliminating Computation Redundancy in CNN-Based Video Processing","authors":"Jordan Schmerge, Daniel Mawhirter, Connor Holmes, Jedidiah McClurg, Bo-Zong Wu","doi":"10.1109/rsdha54838.2021.00010","DOIUrl":null,"url":null,"abstract":"Video processing frequently relies on applying convolutional neural networks (CNNs) for various tasks, including object tracking, real-time action classification, and image recognition. Due to complicated network design, processing even a single frame requires many operations, leading to low throughput and high latency. This process can be parallelized, but since consecutive images have similar content, most of these operations produce identical results, leading to inefficient usage of parallel hardware accelerators. In this paper, we present ELIχR, a software system that systematically addresses this computation redundancy problem in an architecture-independent way, using two key techniques. First, ELIχR implements a lightweight change propagation algorithm to automatically determine which data to recompute for each new frame based on changes in the input. Second, ELIχR implements a dynamic check to further reduce needed computations by leveraging special operators in the model (e.g., ReLU), and trading off accuracy for performance. We evaluate ELIχR on two real-world models, Inception V3 and Resnet-50, and two video streams. We show that ELIχR running on the CPU produces up to 3.49X speedup (1.76X on average) compared with frame sampling, given the same accuracy and real-time processing requirements, and we describe how our approach can be applied in an architecture-independent way to improve CNN performance in heterogeneous systems.","PeriodicalId":119942,"journal":{"name":"2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/rsdha54838.2021.00010","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Video processing frequently relies on applying convolutional neural networks (CNNs) for various tasks, including object tracking, real-time action classification, and image recognition. Due to complicated network design, processing even a single frame requires many operations, leading to low throughput and high latency. This process can be parallelized, but since consecutive images have similar content, most of these operations produce identical results, leading to inefficient usage of parallel hardware accelerators. In this paper, we present ELIχR, a software system that systematically addresses this computation redundancy problem in an architecture-independent way, using two key techniques. First, ELIχR implements a lightweight change propagation algorithm to automatically determine which data to recompute for each new frame based on changes in the input. Second, ELIχR implements a dynamic check to further reduce needed computations by leveraging special operators in the model (e.g., ReLU), and trading off accuracy for performance. We evaluate ELIχR on two real-world models, Inception V3 and Resnet-50, and two video streams. We show that ELIχR running on the CPU produces up to 3.49X speedup (1.76X on average) compared with frame sampling, given the same accuracy and real-time processing requirements, and we describe how our approach can be applied in an architecture-independent way to improve CNN performance in heterogeneous systems.