Generating efficient parallel code for successive over-relaxation

Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing. Pub Date: 1997-12-10. DOI: 10.1109/ICAPP.1997.651517

P. Tang


Citations: 0

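Abstract

A complete suite of algorithms for parallelizing compilers to generate efficient SPMD code for SOR problems is presented. By applying a unimodular transformation before loop tiling and parallelization, the number of messages per iteration per processor is reduced from $3^n - 1$ in the conventional parallel SOR algorithm to $2^n - 1$, where $n$ is the dimensionality of the data set. To maintain memory scalability, a novel approach is proposed that uses the local dynamic memory of the parallel processors to implement the skewed data set.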
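The two message counts in the abstract can be read as counts of neighbouring tiles, and the short Python sketch below illustrates that reading. It is not taken from the paper: the interpretation of $3^n - 1$ as "all adjacent tiles, diagonals included" and of $2^n - 1$ as "adjacent tiles in one direction per axis" is an assumption, as are the helper names `nonzero_offsets` and `skew`, the dependence vectors of a 1-D SOR sweep inside a time loop, and the skewing matrix `T`. The second half shows the standard effect of a unimodular skew: every dependence component becomes non-negative, which is the usual condition for rectangular tiling to be legal.

```python
from itertools import product


def nonzero_offsets(n, steps):
    """All nonzero n-dimensional offset vectors with components drawn from steps."""
    return [v for v in product(steps, repeat=n) if any(v)]


def messages_conventional(n):
    # Assumption: a tile exchanges data with every adjacent tile,
    # diagonal neighbours included, i.e. offsets in {-1, 0, 1}^n minus zero.
    return len(nonzero_offsets(n, (-1, 0, 1)))


def messages_skewed(n):
    # Assumption: after skewing, data flows one way along each axis,
    # so only offsets in {0, 1}^n minus zero remain.
    return len(nonzero_offsets(n, (0, 1)))


def skew(T, d):
    """Apply the integer matrix T to the dependence vector d."""
    return tuple(sum(T[r][c] * d[c] for c in range(len(d))) for r in range(len(T)))


# Hypothetical example: 1-D SOR inside a time loop, index order (t, i).
# a[i-1] comes from the same sweep, a[i+1] and a[i] from the previous sweep.
DEPS = [(0, 1), (1, -1), (1, 0)]
T = [[1, 0], [1, 1]]  # determinant 1, hence unimodular
SKEWED = [skew(T, d) for d in DEPS]

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, messages_conventional(n), messages_skewed(n))
    print(SKEWED)
```

For `n = 2` the two helper functions return 8 and 3, matching $3^2 - 1$ and $2^2 - 1$. The skewed dependence vectors contain no negative component, whereas the original set includes `(1, -1)`.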


