Low energy sketching engines on many-core platform for big data acceleration

A. Kulkarni, Tahmid Abtahi, E. Smith, T. Mohsenin
Published in: 2016 International Great Lakes Symposium on VLSI (GLSVLSI)
Publication date: 2016-05-18
DOI: 10.1145/2902961.2902984 (https://doi.org/10.1145/2902961.2902984)
Citations: 14

Abstract

Almost 90% of the data available today was created within the last couple of years, so processing Big Data sets is of utmost importance. Many solutions have been investigated to increase processing speed and memory capacity; however, the I/O bottleneck is still a critical issue. To tackle this issue, we adopt a sketching technique to reduce data communication. Reconstruction of the sketched matrix is performed using Orthogonal Matching Pursuit (OMP). Additionally, we propose a Gradient Descent OMP (GD-OMP) algorithm to reduce hardware complexity. Real-time big data processing imposes rigid constraints on the sketching kernel; hence, to further reduce hardware overhead, both algorithms are implemented on a low-power, domain-specific many-core platform called Power Efficient Nano Clusters (PENC). The GD-OMP algorithm is evaluated for image reconstruction accuracy and on the PENC many-core architecture. Implementation results show that for large matrix sizes the GD-OMP algorithm is 1.3× faster and consumes 1.4× less energy than OMP implementations. Compared to GPU and quad-core CPU implementations, the PENC many-core reconstructs 5.4× and 9.8× faster, respectively, for large signal sizes with higher sparsity.
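The sketch-then-reconstruct flow the abstract describes can be illustrated with a minimal sketch of textbook OMP in NumPy. This is not the paper's GD-OMP variant or its PENC implementation. The function name `omp`, the Gaussian sketching matrix `Phi`, and the sizes `n`, `m`, `k` are all assumptions chosen for illustration.

```python
import numpy as np

def omp(Phi, y, k, tol=1e-10):
    """Textbook Orthogonal Matching Pursuit: recover a k-sparse x from y = Phi @ x."""
    n = Phi.shape[1]
    residual = y.copy()
    support = []            # indices of the columns selected so far
    coef = np.zeros(0)      # least-squares coefficients on the current support
    for _ in range(k):
        # Select the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(Phi.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Re-fit y on all selected columns (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    x = np.zeros(n)
    x[support] = coef
    return x

# Illustrative sizes (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n, m, k = 256, 128, 3

# Build a k-sparse signal with nonzero magnitudes in [1, 2] and random signs.
x_true = np.zeros(n)
nonzero = rng.choice(n, size=k, replace=False)
x_true[nonzero] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)

# Sketching: compress n values into m measurements with a random matrix.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x_true

# Reconstruction from the sketch.
x_hat = omp(Phi, y, k)
print("max abs error:", np.max(np.abs(x_hat - x_true)))
```

Only the `m` sketched values in `y` need to be communicated instead of the `n` original values, which is the data-movement saving the abstract refers to. The least-squares re-fit inside the loop is the costly step in this textbook form; by its name, GD-OMP presumably substitutes a gradient-descent update there, but the abstract does not give the details.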