2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT): latest papers


On mitigating memory bandwidth contention through bandwidth-aware scheduling

2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT)

Pub Date : 2010-09-11 DOI: 10.1145/1854273.1854306

Di Xu, Chenggang Wu, P. Yew

Shared-memory multiprocessors have come to dominate all platforms, from high-end systems to desktop computers. On such platforms, it is well known that the interconnect between the processors and main memory has become a major bottleneck. Bandwidth-aware job scheduling is an effective and relatively easy-to-implement way to relieve bandwidth contention. Previous policies recognized that bandwidth saturation hurts the throughput of parallel jobs, so they scheduled jobs so that the total bandwidth requirement equals the system peak bandwidth. However, we found that fine-grained bandwidth contention still occurs within a scheduling quantum because of irregular fluctuations in a program's memory access intensity, which previous policies mostly ignore. In this paper, we quantify the impact of bandwidth contention on overall performance. We found that concurrent jobs can achieve higher memory bandwidth utilization, but at the expense of super-linear performance degradation. Based on this observation, we propose a new workload scheduling policy. Its basic idea is that interference due to bandwidth contention can be minimized when bandwidth utilization is maintained at the level of the workload's average bandwidth requirement. Our evaluation is based on both SPEC 2006 and NPB workloads. The results on randomly generated workloads show that our policy can improve system throughput by 4.1% on average over the native OS scheduler, with improvements of up to 11.7% observed.
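The abstract gives only the basic idea of the policy, not its algorithm. As a rough illustration of that idea, the following Python sketch groups jobs into scheduling quanta so that each group's summed bandwidth demand stays close to the workload's average. Everything here is an assumption made for illustration: the function names `pick_group` and `schedule`, the per-job bandwidth figures, and the greedy brute-force selection are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pick_group(jobs, size, target):
    """Return the `size` jobs whose summed bandwidth is closest to `target`.

    `jobs` maps a job name to its average bandwidth demand (units are
    arbitrary, e.g. GB/s). Brute force: only suitable for small job sets.
    """
    return min(
        combinations(sorted(jobs), size),
        key=lambda group: abs(sum(jobs[j] for j in group) - target),
    )


def schedule(jobs, cores):
    """Greedily split `jobs` into per-quantum groups of at most `cores` jobs.

    The target for every quantum is the workload's average demand:
    mean per-job bandwidth multiplied by the number of cores.
    This is an illustrative heuristic, not the policy from the paper.
    """
    target = sum(jobs.values()) / len(jobs) * cores
    remaining = dict(jobs)
    quanta = []
    while remaining:
        size = min(cores, len(remaining))
        group = pick_group(remaining, size, target)
        quanta.append(group)
        for job in group:
            del remaining[job]
    return quanta


if __name__ == "__main__":
    # Hypothetical per-job average bandwidth demands.
    demo_jobs = {"a": 8.0, "b": 6.0, "c": 2.0, "d": 0.0}
    for quantum in schedule(demo_jobs, cores=2):
        print(quantum, sum(demo_jobs[j] for j in quantum))
```

In this toy workload the heuristic pairs the most memory-intensive job with the least intensive one, so both quanta run at the average demand rather than one quantum saturating the memory system while the other leaves it idle.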


Citations: 77

The paralax infrastructure: Automatic parallelization with a helping hand

2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT)

Pub Date : 2010-09-11 DOI: 10.1145/1854273.1854322

H. Vandierendonck, S. Rul, K. D. Bosschere

Speeding up sequential programs on multicores is a challenging problem that is in urgent need of a solution. Automatic parallelization of irregular pointer-intensive codes, exemplified by the SPECint codes, is a very hard problem. This paper shows that, with a helping hand, such auto-parallelization is possible and fruitful.


Citations: 98

Speculative-Aware Execution: A simple and efficient technique for utilizing multi-cores to improve single-thread performance

2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT)

Pub Date : 2010-09-11 DOI: 10.1145/1854273.1854326

R. Mameesh, M. Franklin

This paper presents a new architecture, Speculative-Aware Execution (SAE), that employs speculative-awareness as a means of mitigating the two drawbacks of speculative execution: useless work (work that uses speculative values and so produces incorrect results, or that is done on the wrong path) and redundant work (work that produces results previously obtained). To achieve this, SAE tries to partition the dynamic instruction stream into two disjoint parallel threads. The first is a partially speculative-aware thread (p-thread), which records its speculative state and uses it to avoid useless work caused by speculative values, but takes no account of its control-flow violations. The second is a fully speculative-aware thread (f-thread), which has a full record of the p-thread's speculations; it can therefore steer the p-thread away from incorrect control-flow paths, and it can accurately identify the p-thread's correct work and avoid repeating it, which would otherwise be redundant. By eliminating useless and redundant work, SAE outperforms existing architectures that share a similar high-level micro-architecture while incurring only minor hardware additions and changes. Detailed experimental results confirm that SAE indeed reduces the number of useless and redundant computations. We also report an average performance improvement of 18% for the SPEC_INT2000 benchmarks.


Citations: 0
