ELIχR: Eliminating Computation Redundancy in CNN-Based Video Processing

2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA). Pub Date: 2021-11-01. DOI: 10.1109/rsdha54838.2021.00010

Jordan Schmerge, Daniel Mawhirter, Connor Holmes, Jedidiah McClurg, Bo-Zong Wu



Abstract

Video processing frequently relies on applying convolutional neural networks (CNNs) for various tasks, including object tracking, real-time action classification, and image recognition. Due to complicated network design, processing even a single frame requires many operations, leading to low throughput and high latency. This process can be parallelized, but since consecutive images have similar content, most of these operations produce identical results, leading to inefficient usage of parallel hardware accelerators. In this paper, we present ELIχR, a software system that systematically addresses this computation redundancy problem in an architecture-independent way, using two key techniques. First, ELIχR implements a lightweight change propagation algorithm to automatically determine which data to recompute for each new frame based on changes in the input. Second, ELIχR implements a dynamic check to further reduce needed computations by leveraging special operators in the model (e.g., ReLU), and trading off accuracy for performance. We evaluate ELIχR on two real-world models, Inception V3 and ResNet-50, and two video streams. We show that ELIχR running on the CPU produces up to 3.49× speedup (1.76× on average) compared with frame sampling, given the same accuracy and real-time processing requirements, and we describe how our approach can be applied in an architecture-independent way to improve CNN performance in heterogeneous systems.
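The two techniques named in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a single 1D "valid" convolution followed by ReLU, and every name in it (`conv1d`, `propagate`, `incremental_conv_relu`, the `tol` parameter) is hypothetical. It shows how a set of changed input positions can be mapped to the output positions that must be recomputed, and how ReLU can stop a change from propagating further when the activated value stays the same. A nonzero `tol` ignores small input changes, which is one assumed way to trade accuracy for performance.

```python
def conv1d(x, w):
    """Valid 1D cross-correlation: out[i] = sum_j w[j] * x[i + j]."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, a) for a in v]


def changed_inputs(prev, curr, tol=0.0):
    """Indices whose value moved by more than tol between two frames."""
    return {i for i, (a, b) in enumerate(zip(prev, curr)) if abs(a - b) > tol}


def propagate(changed, k, out_len):
    """Output positions whose width-k input window overlaps a changed input."""
    dirty = set()
    for i in changed:
        # out[o] reads x[o .. o+k-1], so x[i] affects o in [i-k+1, i].
        for o in range(max(0, i - k + 1), min(out_len, i + 1)):
            dirty.add(o)
    return dirty


def incremental_conv_relu(prev_x, curr_x, w, cached_pre, cached_act, tol=0.0):
    """Update cached conv+ReLU results for a new frame.

    Returns (pre, act, changed_act, recomputed), where changed_act holds the
    positions whose post-ReLU value differs from the cache. Only those
    positions would need to be propagated to the next layer.
    """
    k = len(w)
    pre = list(cached_pre)
    act = list(cached_act)
    dirty = propagate(changed_inputs(prev_x, curr_x, tol), k, len(pre))
    changed_act = set()
    for o in sorted(dirty):
        pre[o] = sum(w[j] * curr_x[o + j] for j in range(k))
        new_act = max(0.0, pre[o])
        # ReLU check: if the activation is unchanged (for example, it was
        # zero and is still zero), the change stops at this position.
        if new_act != act[o]:
            act[o] = new_act
            changed_act.add(o)
    return pre, act, changed_act, len(dirty)


# Usage: two "frames" that differ in a single input position.
prev_frame = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
curr_frame = list(prev_frame)
curr_frame[6] = 9.0
weights = [1.0, -1.0, 0.5]

cached_pre = conv1d(prev_frame, weights)
cached_act = relu(cached_pre)
pre, act, changed_act, recomputed = incremental_conv_relu(
    prev_frame, curr_frame, weights, cached_pre, cached_act
)
print("recomputed outputs:", recomputed, "of", len(pre))
print("positions to propagate:", sorted(changed_act))
```

With `tol=0.0` the incremental result matches a full recomputation exactly; the saving comes from touching only the outputs whose input window overlaps a changed position. A real CNN would apply the same idea per layer in two spatial dimensions and across channels, which this sketch does not attempt.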
