SmarCo: An Efficient Many-Core Processor for High-Throughput Applications in Datacenters

Dongrui Fan, Wenming Li, Xiaochun Ye, Da Wang, Hao Zhang, Zhimin Tang, Ninghui Sun
{"title":"SmarCo: An Efficient Many-Core Processor for High-Throughput Applications in Datacenters","authors":"Dongrui Fan, Wenming Li, Xiaochun Ye, Da Wang, Hao Zhang, Zhimin Tang, Ninghui Sun","doi":"10.1109/HPCA.2018.00057","DOIUrl":null,"url":null,"abstract":"Fast-growing high-throughput applications, such as web services, are characterized by high-concurrency processing, hard real-time response, and high-bandwidth memory access. The newly-born applications bring severe challenges to processors in datacenters, both in concurrent processing performance and energy efficiency. To offer a satisfactory quality of services, it is of critical importance to meet these newly emerging demands of high-throughput applications in the future datacenters in a more efficient way. In this paper, we propose a novel architecture, called SmarCo, which allows high-throughput applications to be processed more efficiently in datacenters. Based on the dominant characteristics of high-throughput applications, we implement large-scale many-core architecture with in-pair threads to support high-concurrency processing; we also introduce a hierarchical ring topology and laxity-aware task scheduler to guarantee hard real-time response; furthermore, we propose high-throughput datapath to improve memory access efficiency. We verify the efficiency of SmarCo by using simulators, large-scale FPGA and prototype with TSMC 40-nm technology node. The experimental results show that, compared to Intel Xeon E7-8890V4, SmarCo achieves 10.11X performance improvement and 6.95X energy-efficiency improvement with higher throughput and a better guarantee of real-time response.","PeriodicalId":154694,"journal":{"name":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)","volume":"203 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2018.00057","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

Abstract

Fast-growing high-throughput applications, such as web services, are characterized by high-concurrency processing, hard real-time response, and high-bandwidth memory access. The newly-born applications bring severe challenges to processors in datacenters, both in concurrent processing performance and energy efficiency. To offer a satisfactory quality of services, it is of critical importance to meet these newly emerging demands of high-throughput applications in the future datacenters in a more efficient way. In this paper, we propose a novel architecture, called SmarCo, which allows high-throughput applications to be processed more efficiently in datacenters. Based on the dominant characteristics of high-throughput applications, we implement large-scale many-core architecture with in-pair threads to support high-concurrency processing; we also introduce a hierarchical ring topology and laxity-aware task scheduler to guarantee hard real-time response; furthermore, we propose high-throughput datapath to improve memory access efficiency. We verify the efficiency of SmarCo by using simulators, large-scale FPGA and prototype with TSMC 40-nm technology node. The experimental results show that, compared to Intel Xeon E7-8890V4, SmarCo achieves 10.11X performance improvement and 6.95X energy-efficiency improvement with higher throughput and a better guarantee of real-time response.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
SmarCo:适用于数据中心高吞吐量应用的高效多核处理器
快速增长的高吞吐量应用程序(如web服务)具有高并发处理、硬实时响应和高带宽内存访问的特点。新生的应用程序给数据中心的处理器带来了严峻的挑战,无论是并发处理性能还是能源效率。为了提供令人满意的服务质量,在未来的数据中心中以更有效的方式满足这些新出现的高吞吐量应用需求至关重要。在本文中,我们提出了一种新的架构,称为SmarCo,它允许在数据中心中更有效地处理高吞吐量应用程序。基于高吞吐量应用的主要特点,我们实现了大规模的多核架构和成对线程,以支持高并发处理;我们还引入了分层环拓扑和松弛感知任务调度器来保证硬实时响应;此外,我们提出了高吞吐量的数据路径,以提高内存访问效率。通过仿真器、大规模FPGA和台积电40nm工艺节点的原型验证了SmarCo的效率。实验结果表明,与Intel至强E7-8890V4相比,SmarCo的性能提高了10.11倍,能效提高了6.95倍,吞吐量更高,实时响应得到了更好的保证。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Record-Replay Architecture as a General Security Framework LATTE-CC: Latency Tolerance Aware Adaptive Cache Compression Management for Energy Efficient GPUs Secure DIMM: Moving ORAM Primitives Closer to Memory OuterSPACE: An Outer Product Based Sparse Matrix Multiplication Accelerator WIR: Warp Instruction Reuse to Minimize Repeated Computations in GPUs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1