Minnow

ACM Sigplan Notices (JCR Q1, Computer Science). Pub Date: 2018-11-30. DOI: 10.1145/3296957.3173197

Dan Zhang, Xiaoyu Ma, Michael Thomson, Derek Chiou


Citations: 2

Abstract

The importance of irregular applications such as graph analytics is rapidly growing with the rise of Big Data. However, parallel graph workloads tend to perform poorly on general-purpose chip multiprocessors (CMPs) due to poor cache locality, low compute intensity, frequent synchronization, uneven task sizes, and dynamic task generation. At high thread counts, execution time is dominated by worklist synchronization overhead and cache misses. Researchers have proposed hardware worklist accelerators to address scheduling costs, but these proposals often harden a specific scheduling policy and do not address high cache miss rates. We address this with Minnow, a technique that augments each core in a CMP with a lightweight Minnow accelerator. Minnow engines offload worklist scheduling from worker threads to improve scalability. The engines also perform worklist-directed prefetching, a technique that exploits knowledge of upcoming tasks to issue nearly perfectly accurate and timely prefetch operations. On a simulated 64-core CMP running a parallel graph benchmark suite, Minnow improves scalability and reduces L2 cache misses from 29 to 1.2 MPKI on average, resulting in 6.01x average speedup over an optimized software baseline for only 1% area overhead.
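The idea behind worklist-directed prefetching can be illustrated with a small software toy. The sketch below is not the Minnow hardware or the authors' code: it is a worklist-driven breadth-first search in which, before each task is popped, the next few worklist entries are "prefetched" into a tiny LRU cache model. The names `TinyCache`, `bfs`, `lookahead`, and `cache_capacity` are hypothetical, and the model ignores memory latency entirely, so it only shows why knowing the upcoming tasks makes prefetches accurate, not why they are timely.

```python
from collections import OrderedDict, deque


class TinyCache:
    """Toy LRU cache keyed by node id; counts demand misses only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.demand_misses = 0

    def _insert(self, key):
        # Insert or refresh the key, then evict the least recently used entry.
        self.lines[key] = True
        self.lines.move_to_end(key)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)

    def prefetch(self, key):
        # Bring the key in ahead of use; not counted as a demand miss.
        self._insert(key)

    def access(self, key):
        # Demand access by the worker: a miss if the key is not resident.
        if key not in self.lines:
            self.demand_misses += 1
        self._insert(key)


def bfs(graph, source, lookahead=0, cache_capacity=4):
    """Worklist-driven BFS; returns (distances, demand miss count).

    Assumption: cache_capacity >= lookahead, so a prefetched task is
    still resident when the worker reaches it.
    """
    cache = TinyCache(cache_capacity)
    dist = {source: 0}
    worklist = deque([source])
    while worklist:
        # The worklist already names the upcoming tasks, so their data
        # can be requested before the worker asks for it.
        for i in range(min(lookahead, len(worklist))):
            cache.prefetch(worklist[i])
        node = worklist.popleft()
        cache.access(node)  # worker reads this node's adjacency data
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                worklist.append(nbr)  # dynamic task generation
    return dist, cache.demand_misses
```

On the four-node graph `{0: [1, 2], 1: [3], 2: [3], 3: []}`, `bfs(graph, 0)` records 4 demand misses (one per node), while `bfs(graph, 0, lookahead=2)` records 0 and returns the same distances. In real hardware the lookahead distance would also have to be large enough to hide memory latency, which this toy does not model.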




Source journal

ACM Sigplan Notices (Engineering & Technology, Computer Science: Software Engineering)

CiteScore

4.90

Self-citation rate

0.00%

Articles published

Review time

2-4 weeks

Journal description: The ACM Special Interest Group on Programming Languages explores programming language concepts and tools, focusing on design, implementation, practice, and theory. Its members are programming language developers, educators, implementers, researchers, theoreticians, and users. SIGPLAN sponsors several major annual conferences, including the Symposium on Principles of Programming Languages (POPL), the Symposium on Principles and Practice of Parallel Programming (PPoPP), the Conference on Programming Language Design and Implementation (PLDI), the International Conference on Functional Programming (ICFP), the International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), as well as more than a dozen other events of either smaller size or in-cooperation with other SIGs. The monthly "ACM SIGPLAN Notices" publishes proceedings of selected sponsored events and an annual report on SIGPLAN activities. Members receive discounts on conference registrations and free access to ACM SIGPLAN publications in the ACM Digital Library. SIGPLAN recognizes significant research and service contributions of individuals with a variety of awards, supports current members through the Professional Activities Committee, and encourages future programming language enthusiasts with frequent Programming Languages Mentoring Workshops (PLMW).