Dadu-CD: Fast and Efficient Processing-in-Memory Accelerator for Collision Detection

Yuxin Yang, Xiaoming Chen, Yinhe Han
Published in: 2020 57th ACM/IEEE Design Automation Conference (DAC)
Publication date: 2020-07-01
DOI: 10.1109/DAC18072.2020.9218709 (https://doi.org/10.1109/DAC18072.2020.9218709)
Citations: 9

Abstract

Collision detection is a fundamental task in robot motion planning. It is typically the bottleneck of the entire motion planning process, in both performance and energy consumption. Several hardware accelerators have been proposed for collision detection, and they achieve higher performance and energy efficiency than general-purpose CPUs and GPUs. However, existing accelerators still face a memory bandwidth bottleneck, because the parallel processing cores require a large volume of data while DRAM bandwidth is limited. In this work, we propose a novel collision detection accelerator that employs the processing-in-memory technique. We design the in-memory processing architecture to fully utilize the internal bandwidth of DRAM banks. We also propose a set of innovative software and hardware techniques that make the algorithm and hardware well suited to efficient in-memory processing. Compared with a state-of-the-art ASIC-based collision detection accelerator, our accelerator significantly improves both performance and energy efficiency.