平均偏移视觉跟踪的剖面驱动数据流优化

Deepayan Bhowmik, A. Wallace, Robert J. Stewart, Xinyuan Qian, G. Michaelson
{"title":"平均偏移视觉跟踪的剖面驱动数据流优化","authors":"Deepayan Bhowmik, A. Wallace, Robert J. Stewart, Xinyuan Qian, G. Michaelson","doi":"10.1109/GlobalSIP.2014.7032066","DOIUrl":null,"url":null,"abstract":"Profile guided optimisation is a common technique used by compilers and runtime systems to shorten execution runtimes and to optimise locality aware scheduling and memory access on heterogeneous hardware platforms. Some profiling tools trace the execution of low level code, whilst others are designed for abstract models of computation to provide rich domain-specific context in profiling reports. We have implemented mean shift, a computer vision tracking algorithm, in the RVC-CAL dataflow language and use both dynamic runtime and static dataflow profiling mechanisms to identify and eliminate bottlenecks in our naive initial version. We use these profiling reports to tune the CPU scheduler reducing runtime by 88%, and to optimise our dataflow implementation that reduces runtime by a further 43% - an overall runtime reduction of 93%. We also assess the portability of our mean shift optimisations by trading off CPU runtime against resource utilisation on FPGAs. Applying all dataflow optimisations reduces FPGA design space significantly, requiring fewer slice LUTs and less block memory.","PeriodicalId":362306,"journal":{"name":"2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Profile driven dataflow optimisation of mean shift visual tracking\",\"authors\":\"Deepayan Bhowmik, A. Wallace, Robert J. Stewart, Xinyuan Qian, G. Michaelson\",\"doi\":\"10.1109/GlobalSIP.2014.7032066\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Profile guided optimisation is a common technique used by compilers and runtime systems to shorten execution runtimes and to optimise locality aware scheduling and memory access on heterogeneous hardware platforms. Some profiling tools trace the execution of low level code, whilst others are designed for abstract models of computation to provide rich domain-specific context in profiling reports. We have implemented mean shift, a computer vision tracking algorithm, in the RVC-CAL dataflow language and use both dynamic runtime and static dataflow profiling mechanisms to identify and eliminate bottlenecks in our naive initial version. We use these profiling reports to tune the CPU scheduler reducing runtime by 88%, and to optimise our dataflow implementation that reduces runtime by a further 43% - an overall runtime reduction of 93%. We also assess the portability of our mean shift optimisations by trading off CPU runtime against resource utilisation on FPGAs. Applying all dataflow optimisations reduces FPGA design space significantly, requiring fewer slice LUTs and less block memory.\",\"PeriodicalId\":362306,\"journal\":{\"name\":\"2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP)\",\"volume\":\"59 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/GlobalSIP.2014.7032066\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GlobalSIP.2014.7032066","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

摘要

概要文件引导的优化是编译器和运行时系统使用的一种常用技术,用于缩短执行运行时间,并优化异构硬件平台上的位置感知调度和内存访问。一些分析工具跟踪低级代码的执行,而另一些则是为抽象的计算模型而设计的,以便在分析报告中提供丰富的特定于领域的上下文。我们在RVC-CAL数据流语言中实现了mean shift(一种计算机视觉跟踪算法),并使用动态运行时和静态数据流分析机制来识别和消除初始版本中的瓶颈。我们使用这些分析报告来调优CPU调度器,使运行时间减少88%,并优化我们的数据流实现,使运行时间进一步减少43%——总体运行时间减少93%。我们还通过权衡CPU运行时和fpga上的资源利用率来评估平均移位优化的可移植性。应用所有数据流优化大大减少了FPGA设计空间,需要更少的片lut和更少的块内存。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Profile driven dataflow optimisation of mean shift visual tracking
Profile guided optimisation is a common technique used by compilers and runtime systems to shorten execution runtimes and to optimise locality aware scheduling and memory access on heterogeneous hardware platforms. Some profiling tools trace the execution of low level code, whilst others are designed for abstract models of computation to provide rich domain-specific context in profiling reports. We have implemented mean shift, a computer vision tracking algorithm, in the RVC-CAL dataflow language and use both dynamic runtime and static dataflow profiling mechanisms to identify and eliminate bottlenecks in our naive initial version. We use these profiling reports to tune the CPU scheduler reducing runtime by 88%, and to optimise our dataflow implementation that reduces runtime by a further 43% - an overall runtime reduction of 93%. We also assess the portability of our mean shift optimisations by trading off CPU runtime against resource utilisation on FPGAs. Applying all dataflow optimisations reduces FPGA design space significantly, requiring fewer slice LUTs and less block memory.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Competitive design of power allocation strategies for energy harvesting wireless communication systems Correction of over-exposure using color channel correlations Communications meets copula modeling: Non-standard dependence features in wireless fading channels Energy efficient and low complex wireless communication Feasibility of positive secrecy rate in wiretap interference channels
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1