On the run-time cost of distributed-memory communications generated using the polyhedral model

2015 International Conference on High Performance Computing & Simulation (HPCS). Pub Date: 2015-07-20. DOI: 10.1109/HPCSim.2015.7237034

Ana Moreton-Fernandez, Arturo González-Escribano, D. Ferraris


Citations: 4

Abstract

The polyhedral model can be used to automatically generate distributed-memory communications for affine nested loops. Recently, new communication schemes that reduce the communication volume have been presented. In this paper we study the extra computational effort introduced at run-time by the code generated to manage the communication details across distributed processes. We focus on the most sophisticated communication scheme introduced so far (the FOP scheme). We present an asymptotic cost study of the FOP scheme in terms of two main run-time parameters: the problem size and the number of processors. Based on this study, we identify scalability limitations in current implementations of these techniques, and propose a simple implementation alternative to eliminate one of them. Experimental results are presented, showing the potential impact of these implementation limitations on performance when using these codes in large parallel systems.

