Latency and throughput characterization of convolutional neural networks for mobile computer vision

Jussi Hanhirova, Teemu Kämäräinen, S. Seppälä, M. Siekkinen, V. Hirvisalo, Antti Ylä-Jääski
Proceedings of the 9th ACM Multimedia Systems Conference. Publication date: 2018-03-26
DOI: 10.1145/3204949.3204975 (https://doi.org/10.1145/3204949.3204975)
Citations: 79

Abstract

We study the performance characteristics of convolutional neural networks (CNNs) for mobile computer vision systems. CNNs have proven to be a powerful and efficient approach for implementing such systems. However, system performance depends largely on the utilization of hardware accelerators, which can speed up the execution of the underlying mathematical operations tremendously through massive parallelism. Our contribution is a performance characterization of multiple CNN-based models for object recognition and detection on several different hardware platforms and software frameworks, using both local (on-device) and remote (network-side server) computation. The measurements are conducted using real workloads and real processing platforms. On the platform side, we concentrate especially on TensorFlow and TensorRT. Our measurements include embedded processors found in mobile devices and high-performance processors that can be used on the network side of mobile systems. We show that there exist significant latency-throughput trade-offs, but the behavior is very complex. We demonstrate and discuss several factors that affect the performance and yield this complex behavior.
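
To make the latency-throughput trade-off mentioned in the abstract concrete, the following is a minimal sketch under a deliberately simple assumption: each accelerator call costs a fixed overhead plus a constant time per image. The class `BatchCost`, the function `sweep`, and all numbers are hypothetical and are not taken from the paper, which reports that real behavior is considerably more complex than any such linear model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchCost:
    """Hypothetical linear cost model for one accelerator call (not from the paper)."""
    fixed_ms: float      # assumed per-call overhead, e.g. launch and transfer setup
    per_image_ms: float  # assumed marginal cost of each image in the batch

    def latency_ms(self, batch_size: int) -> float:
        # Every image in a batch waits until the whole batch has finished.
        if batch_size < 1:
            raise ValueError("batch_size must be >= 1")
        return self.fixed_ms + self.per_image_ms * batch_size

    def throughput_ips(self, batch_size: int) -> float:
        # Images per second when batches are executed back to back.
        return batch_size / (self.latency_ms(batch_size) / 1000.0)


def sweep(cost, batch_sizes):
    """Return (batch_size, latency_ms, throughput_ips) rows for each batch size."""
    return [(b, cost.latency_ms(b), cost.throughput_ips(b)) for b in batch_sizes]


if __name__ == "__main__":
    # Illustrative numbers only; a real platform has to be measured.
    model = BatchCost(fixed_ms=8.0, per_image_ms=2.0)
    for b, lat, thr in sweep(model, [1, 2, 4, 8, 16]):
        print(f"batch={b:2d}  latency={lat:5.1f} ms  throughput={thr:6.1f} img/s")
```

Under this assumed model, larger batches amortize the fixed overhead, so throughput rises while the latency seen by each image also rises. Replacing the model with timings measured on an actual device would be the natural next step.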